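When writing an RPC server/client in Elixir, the options passed to gen_tcp need some thought. Some options, such as sndbuf and recbuf, should stay user-configurable so they can be tuned for the use case; others should be hidden to reduce the cost of understanding and using the library.
So I dug into the gen_tcp options.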
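One way to split options into "tunable" and "hidden" is a simple whitelist. The sketch below is only an illustration: the module name RpcOpts, the whitelist, and the fixed values are assumptions of mine, not something taken from any existing library.

```elixir
defmodule RpcOpts do
  # Hypothetical helper: the names and values below are illustrative only.
  # Options the user may tune.
  @user_tunable [:sndbuf, :recbuf, :nodelay, :keepalive, :send_timeout]
  # Options the RPC layer decides on its own and does not expose.
  @fixed [mode: :binary, packet: 4, active: false]

  @doc "Builds the option list handed to :gen_tcp.connect/3."
  def build(user_opts) do
    user_opts
    |> Keyword.take(@user_tunable)   # silently drop anything not whitelisted
    |> Keyword.merge(@fixed)         # fixed options always win
  end
end

opts = RpcOpts.build(sndbuf: 65_536, packet: 2)
IO.inspect(opts)
```

Here the user-supplied packet: 2 is dropped and the fixed packet: 4 is used instead, while sndbuf passes through unchanged.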
The files and line numbers quoted in this post refer to the following code version.
inet.erl:723
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Available options for tcp:connect
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
connect_options() ->
    [tos, tclass, priority, reuseaddr, keepalive, linger, sndbuf, recbuf, nodelay,
     header, active, packet, packet_size, buffer, mode, deliver, line_delimiter,
     exit_on_close, high_watermark, low_watermark, high_msgq_watermark,
     low_msgq_watermark, send_timeout, send_timeout_close, delay_send, raw,
     show_econnreset, bind_to_device].
type of service
The figure below is from TCP/IP Illustrated, Volume 1.
IPV6_TCLASS
{tclass, Integer}
Sets IPV6_TCLASS IP level options on platforms where this is implemented.
The behavior and allowed range varies between different systems.
The option is ignored on platforms where it is not implemented. Use with caution.
I don't know what this means in practice, so I'm skipping it.
SO_PRIORITY Set the protocol-defined priority for all packets to be sent on this socket. Linux uses this value to order the networking queues: packets with a higher priority may be processed first depending on the selected device queueing discipline. Setting a priority outside the range 0 to 6 requires the CAP_NET_ADMIN capability.
SO_REUSEPORT (since Linux 3.9) Permits multiple AF_INET or AF_INET6 sockets to be bound to an identical socket address. This option must be set on each socket (including the first socket) prior to calling bind(2) on the socket. To prevent port hijacking, all of the processes binding to the same address must have the same effective UID. This option can be employed with both TCP and UDP sockets.

For TCP sockets, this option allows accept(2) load distribution in a multi-threaded server to be improved by using a distinct listener socket for each thread. This provides improved load distribution as compared to traditional techniques such using a single accept(2)ing thread that distributes connections, or having multiple threads that compete to accept(2) from the same socket.

For UDP sockets, the use of this option can provide better distribution of incoming datagrams to multiple processes (or threads) as compared to the traditional technique of having multiple processes compete to receive datagrams on the same socket.
SO_KEEPALIVE Enable sending of keep-alive messages on connection-oriented sockets. Expects an integer boolean flag.
root@1ba6f31f7bc3:/# cat /proc/sys/net/ipv4/tcp_keepalive_time
1800
the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

root@1ba6f31f7bc3:/# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
the interval between subsequential keepalive probes, regardless of what the connection has exchanged in the meantime

root@1ba6f31f7bc3:/# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
the number of unacknowledged probes to send before considering the connection dead and notifying the application layer
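From the Elixir side, turning keepalive on is just the keepalive option. A minimal sketch, assuming a throwaway listener on the loopback interface so that the example does not depend on any external server:

```elixir
# Throwaway listener on an ephemeral loopback port (assumption for the demo).
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port} = :inet.port(listen)

# Enable SO_KEEPALIVE on the client socket.
{:ok, sock} =
  :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false, keepalive: true])

# Read the option back to confirm it was applied.
{:ok, [keepalive: enabled]} = :inet.getopts(sock, [:keepalive])
IO.inspect(enabled, label: "keepalive")

:gen_tcp.close(sock)
:gen_tcp.close(listen)
```

The probe timings themselves still come from the sysctl values shown above; this sketch does not try to override them per socket.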
SO_LINGER Sets or gets the SO_LINGER option. The argument is a linger structure.

    struct linger {
        int l_onoff;    /* linger active */
        int l_linger;   /* how many seconds to linger for */
    };

When enabled, a close(2) or shutdown(2) will not return until all queued messages for the socket have been successfully sent or the linger timeout has been reached. Otherwise, the call returns immediately and the closing is done in the background. When the socket is closed as part of exit(2), it always lingers in the background.
In short: whether close/shutdown waits for all queued data to be sent (or for the linger timeout to expire) before returning.
SO_SNDBUF Sets or gets the maximum socket send buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/wmem_default file and the maximum allowed value is set by the /proc/sys/net/core/wmem_max file. The minimum (doubled) value for this option is 2048.

SO_RCVBUF Sets or gets the maximum socket receive buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/rmem_default file, and the maximum allowed value is set by the /proc/sys/net/core/rmem_max file. The minimum (doubled) value for this option is 256.
inet_drv.c:6708
    case INET_OPT_SNDBUF:
    {
        arg.ival = get_int32 (curr);
        curr    += 4;
        proto    = SOL_SOCKET;
        type     = SO_SNDBUF;
        arg_ptr  = (char*) (&arg.ival);
        arg_sz   = sizeof ( arg.ival);

        /* Adjust the size of the user-level recv buffer, so it's not
           smaller than the kernel one: */
        if (desc->bufsz <= arg.ival)
            desc->bufsz = arg.ival;
        break;
    }
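As the code shows, buffer is the user-level buffer and is meant to be no smaller than the kernel buffer. However, the buffer value I actually observed was smaller than both recbuf and sndbuf.
My guess: buffer is only adjusted when recbuf or sndbuf is set explicitly.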
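The guess above can be checked by comparing a socket opened with defaults against one opened with an explicit recbuf. This is only a sketch: the loopback listener and the 262_144 value are arbitrary choices of mine, and the numbers printed depend on the OS and the OTP version, so no expected output is shown.

```elixir
# Throwaway listener on an ephemeral loopback port (assumption for the demo).
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port} = :inet.port(listen)

# One connection with default options, one with recbuf set explicitly.
{:ok, plain} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

{:ok, tuned} =
  :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false, recbuf: 262_144])

# sndbuf/recbuf are the kernel buffers, buffer is the driver's user-level buffer.
{:ok, plain_opts} = :inet.getopts(plain, [:sndbuf, :recbuf, :buffer])
{:ok, tuned_opts} = :inet.getopts(tuned, [:sndbuf, :recbuf, :buffer])

IO.inspect(plain_opts, label: "defaults")
IO.inspect(tuned_opts, label: "recbuf set")

Enum.each([plain, tuned, listen], &:gen_tcp.close/1)
```

If the guess is right, buffer should be at least as large as the requested recbuf only in the second case.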
TCP_NODELAY
DISCUSSION: The Nagle algorithm is generally as follows: If there is unacknowledged data (i.e., SND.NXT > SND.UNA), then the sending TCP buffers all user data (regardless of the PSH bit), until the outstanding data has been acknowledged or until the TCP can send a full-sized segment (Eff.snd.MSS bytes; see Section 4.2.2.6). Some applications (e.g., real-time display window updates) require that the Nagle algorithm be turned off, so small data segments can be streamed out at the maximum rate.
As you can see, combining the Nagle algorithm with delayed ACK can introduce significant latency.
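For small request/response messages, as in an RPC client, Nagle can be switched off with the nodelay option. A minimal sketch, again assuming a throwaway loopback listener; here the option is applied after connecting to show that it can also be changed on a live socket:

```elixir
# Throwaway listener on an ephemeral loopback port (assumption for the demo).
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port} = :inet.port(listen)
{:ok, sock} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

# Set TCP_NODELAY on the already-connected socket.
:ok = :inet.setopts(sock, nodelay: true)

# Read it back to confirm.
{:ok, [nodelay: nodelay]} = :inet.getopts(sock, [:nodelay])
IO.inspect(nodelay, label: "nodelay")

:gen_tcp.close(sock)
:gen_tcp.close(listen)
```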
http://erlang.org/doc/man/ine...
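header: a fixed-length header. Handy when the protocol has a fixed-length header to process.
active: use passive mode, with asynchronous send and receive.
packet: the length-prefix size, i.e. how many bytes are used to encode the packet length. raw is equivalent to {packet, 0}.
packet_size: the maximum allowed packet length.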
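To see what packet actually puts on the wire, one side can use {packet, 4} and the other side can read the raw bytes. This is a sketch under the assumption of a loopback listener; the variable names are mine.

```elixir
# Server side reads raw bytes so the length prefix stays visible.
{:ok, listen} =
  :gen_tcp.listen(0, [:binary, packet: :raw, active: false, ip: {127, 0, 0, 1}])

{:ok, port} = :inet.port(listen)

# Client side lets the driver prepend a 4-byte length prefix.
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, packet: 4, active: false])
{:ok, server} = :gen_tcp.accept(listen)

:ok = :gen_tcp.send(client, "hello")

# 4 bytes of prefix + 5 bytes of payload.
{:ok, frame} = :gen_tcp.recv(server, 9, 1_000)
<<len::32, payload::binary-size(len)>> = frame
IO.inspect({len, payload})
# → {5, "hello"}

Enum.each([client, server, listen], &:gen_tcp.close/1)
```

With packet: 4 on both ends the prefix is added and stripped automatically, so the receiver would get just "hello".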
{mode, Mode :: binary | list}
Received Packet is delivered as defined by Mode.
{deliver, port | term}
When {active, true}, data is delivered on the form port : {S, {data, [H1,..Hsz | Data]}} or term : {tcp, S, [H1..Hsz | Data]}.
{line_delimiter, Char} (TCP/IP sockets)
Sets the line delimiting character for line-oriented protocols (line). Defaults to $\n.
{exit_on_close, Boolean}
This option is set to true by default.
The only reason to set it to false is if you want to continue sending data to the socket after a close is detected, for example, if the peer uses gen_tcp:shutdown/2 to shut down the write side.
low_msgq_watermark
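Affects when the socket switches into and out of the busy state.
A few questions still need answering:
What is the socket busy state? For example, what do send/receive calls return while the socket is busy?
What is the difference between msgq data size and socket data size? Is socket data size simply buffer?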
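A starting point for answering these questions is to look at the current watermark values on a live socket. The sketch below only reads them; it assumes a throwaway loopback listener, and since the values depend on the OTP version no expected output is shown.

```elixir
# Throwaway listener on an ephemeral loopback port (assumption for the demo).
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port} = :inet.port(listen)
{:ok, sock} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

# The four watermark options from connect_options().
watermark_opts = [
  :high_watermark,
  :low_watermark,
  :high_msgq_watermark,
  :low_msgq_watermark
]

# getopts omits options the platform does not support, so do not assume all four.
{:ok, current} = :inet.getopts(sock, watermark_opts)
IO.inspect(current, label: "watermarks")

:gen_tcp.close(sock)
:gen_tcp.close(listen)
```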
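send_timeout: the send timeout. By default a send waits indefinitely.
send_timeout_close: whether the socket is closed automatically when a send times out.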
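These two options are usually set together. The wrapper below is a hypothetical sketch: the module name SendGuard, the function names, and the 5_000 ms limit are illustrative assumptions, not part of gen_tcp.

```elixir
defmodule SendGuard do
  # Hypothetical wrapper; names and the 5_000 ms limit are illustrative only.
  @connect_opts [:binary, active: false, send_timeout: 5_000, send_timeout_close: true]

  def connect(host, port), do: :gen_tcp.connect(host, port, @connect_opts)

  def send_data(sock, data) do
    case :gen_tcp.send(sock, data) do
      :ok ->
        :ok

      # With send_timeout_close: true the socket is closed by the time we get here,
      # so the caller should reconnect rather than retry on the same socket.
      {:error, :timeout} ->
        {:error, :send_timeout}

      {:error, reason} ->
        {:error, reason}
    end
  end
end

# Usage against a throwaway loopback listener (assumption for the demo).
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])
{:ok, port} = :inet.port(listen)
{:ok, sock} = SendGuard.connect({127, 0, 0, 1}, port)
result = SendGuard.send_data(sock, "ping")
IO.inspect(result)

:gen_tcp.close(sock)
:gen_tcp.close(listen)
```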
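delay_send: batching of sends at the application layer. Off by default; worth considering turning on.
show_econnreset: whether an RST is treated as a normal close.
bind_to_device: use a specific device (network interface).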