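Active mode: the server uses its designated data port (20 by default) to actively connect to the port submitted by the client, and sends data to the client over that connection.
The client sends "PORT xxx,xxx,xxx,xxx,ppp,ppp" and then waits for the server to initiate the data connection.
The server replies "200" to accept. At this point the data channel can be established.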
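The PORT argument encodes the IPv4 address as four decimal bytes and the port as two (high byte first). A minimal sketch of decoding it follows; `parse_ftp_addrport` is a hypothetical helper written for illustration, not the kernel's `ip_vs_ftp_get_addrport`, and it returns host-order values for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse "h1,h2,h3,h4,p1,p2" into a host-order IPv4
 * address and port. Returns 1 on success, 0 on malformed input.
 */
static int parse_ftp_addrport(const char *s, uint32_t *ip, uint16_t *port)
{
    unsigned int f[6];
    int i;

    assert(s != NULL && ip != NULL && port != NULL);

    if (sscanf(s, "%u,%u,%u,%u,%u,%u",
               &f[0], &f[1], &f[2], &f[3], &f[4], &f[5]) != 6)
        return 0;

    /* Every field must fit in one byte. */
    for (i = 0; i < 6; i++)
        if (f[i] > 255)
            return 0;

    *ip = (f[0] << 24) | (f[1] << 16) | (f[2] << 8) | f[3];
    *port = (uint16_t)((f[4] << 8) | f[5]);   /* port = p1 * 256 + p2 */
    return 1;
}
```

For example, the argument "192,168,1,10,19,137" decodes to 192.168.1.10 and port 19 * 256 + 137 = 5001.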
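Passive mode: the server accepts the client's request to use passive mode, starts listening on a data port, passively waits for the client to connect, and then sends data to the client.
The client sends "PASV" to tell the server to use passive mode. The server replies "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)" to accept, and in the same reply sends the client the IP and port it is listening on.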
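In short: if the server actively connects to the client, it is active mode; if the server passively waits for the client to connect (that is, the client actively connects to the server), it is passive mode.
FTP has active and passive modes while protocols such as SSH have no such notion because FTP uses a separate port to transfer data.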
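As described above, in active mode the client uses the "PORT" command to send the server the IP and port it is listening on, so the data-channel information can be captured in the out2in direction. Currently only NAT mode needs FTP ALG handling. The DNAT function parses the FTP payload, finds the PORT command, and adds a connection entry for the data channel.
Suppose the IP extracted from the payload is dataip, the port is dataport, and the control channel's connection entry is cn. The 7-tuple of the added connection entry is then: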
caddr: dataip
cport: dataport
vaddr: cn->vaddr
vport: cn->vport - 1 (i.e. 20)
daddr: cn->daddr
dport: cn->dport - 1 (i.e. 20)
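After the connection entry is created, its state is set to IP_VS_TCP_S_LISTEN and its timeout timer is set to the timeout for that state. Active mode needs no sequence-number correction, because the address carried in the PORT command is not rewritten and the payload length therefore does not change.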
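As described above, in passive mode the server sends the client the IP and port it is listening on, so the data-channel information can be captured in the in2out direction. Currently only NAT mode needs FTP ALG handling. Because this happens in the in2out direction, the data-channel connection entry is handled in the SNAT path, the reverse of DNAT.
Suppose the IP extracted from the payload is dataip, the port is dataport, and the control channel's connection entry is cn. The 7-tuple of the added connection entry is then: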
caddr: cn->caddr
cport: 0
vaddr: cn->vaddr
vport: dataport
daddr: dataip
dport: dataport
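The connection entry above shows that LVS expects the destination IP of the client's data connection to be the VIP as well, which differs from the dataip announced by the real server. The IP and port information in the FTP packet must therefore be rewritten so that the client's data connection hits this entry. In addition, the port the client will use to connect to the server's data channel is not yet known, so cport in the entry is set to 0 and the flag IP_VS_CONN_F_NO_CPORT is set. The flag means that cport must be filled in once it is known, which is when the client's SYN for the data channel arrives and matches this entry.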
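Because the application-layer payload has to be modified, the TCP sequence numbers change as well. LVS relies on netfilter's seqadj mechanism to handle this, and sets the flag IP_VS_CONN_F_NFCT to indicate that the conntrack entry must not be deleted.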
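Why sequence numbers are affected can be seen from the length of the rewritten address string. The sketch below formats an address the same way the `snprintf` call in `ip_vs_ftp_out` does and computes the resulting size change; `format_ftp_addrport` and `payload_delta` are hypothetical helpers, and the addresses are host-order for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a host-order IPv4 address and port as "h1,h2,h3,h4,p1,p2".
 * Returns the string length, or -1 if buf is too small.
 */
static int format_ftp_addrport(uint32_t ip, uint16_t port,
                               char *buf, size_t len)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, len, "%u,%u,%u,%u,%u,%u",
                 (unsigned int)((ip >> 24) & 0xFF),
                 (unsigned int)((ip >> 16) & 0xFF),
                 (unsigned int)((ip >> 8) & 0xFF),
                 (unsigned int)(ip & 0xFF),
                 (unsigned int)(port >> 8),
                 (unsigned int)(port & 0xFF));
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}

/* Size change of the payload when old_arg is replaced by new_arg. */
static int payload_delta(const char *old_arg, const char *new_arg)
{
    assert(old_arg != NULL && new_arg != NULL);
    return (int)strlen(new_arg) - (int)strlen(old_arg);
}
```

If the real server announced "192,168,100,200,195,80" and the VIP is 10.0.0.1, the replacement "10,0,0,1,195,80" is 7 bytes shorter, so every later byte on that direction of the control connection is shifted by -7. That offset is what the sequence adjustment has to compensate for in both directions.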
/*
 * Look at incoming ftp packets to catch the PASV/PORT command
 * (outside-to-inside).
 *
 * The incoming packet having the PORT command should be something like
 *      "PORT xxx,xxx,xxx,xxx,ppp,ppp\n".
 * xxx,xxx,xxx,xxx is the client address, ppp,ppp is the client port number.
 * In this case, we create a connection entry using the client address and
 * port, so that the active ftp data connection from the server can reach
 * the client.
 */
static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
                        struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_start, *data_limit;
    char *start, *end;
    union nf_inet_addr to;
    __be16 port;
    struct ip_vs_conn *n_cp;

    /* no diff required for incoming packets */
    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;

    /*
     * Detecting whether it is passive
     */
    iph = ip_hdr(skb);
    th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);

    /* Since there may be OPTIONS in the TCP packet and the HLEN is
     * the length of the header in 32-bit multiples, it is accurate
     * to calculate data address by th+HLEN*4
     */
    data = data_start = (char *)th + (th->doff << 2);
    data_limit = skb_tail_pointer(skb);

    /* Check for passive mode. 6 is the length of "PASV\r\n";
     * this is a brute-force scan of the payload.
     */
    while (data <= data_limit - 6) {
        if (strncasecmp(data, "PASV\r\n", 6) == 0) {
            /* Passive mode on */
            IP_VS_DBG(7, "got PASV at %td of %td\n",
                      data - data_start,
                      data_limit - data_start);
            cp->app_data = &ip_vs_ftp_pasv;
            return 1;
        }
        data++;
    }

    /*
     * To support virtual FTP server, the scenerio is as follows:
     *       FTP client ----> Load Balancer ----> FTP server
     * First detect the port number in the application data,
     * then create a new connection entry for the coming data
     * connection.
     * This case is active mode.
     */
    if (ip_vs_ftp_get_addrport(data_start, data_limit,
                               CLIENT_STRING, sizeof(CLIENT_STRING)-1,
                               ' ', '\r', &to.ip, &port,
                               &start, &end) != 1)
        return 1;

    IP_VS_DBG(7, "PORT %pI4:%d detected\n", &to.ip, ntohs(port));

    /* Passive mode off */
    cp->app_data = NULL;

    /*
     * Now update or create a connection entry for it
     */
    IP_VS_DBG(7, "protocol %s %pI4:%d %pI4:%d\n",
              ip_vs_proto_name(iph->protocol),
              &to.ip, ntohs(port), &cp->vaddr.ip, 0);

    {
        struct ip_vs_conn_param p;

        /* Create the connection entry for the active-mode data
         * channel. vport = cp->vport - 1, i.e. 20 when the control
         * port is 21.
         */
        ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                              iph->protocol, &to, port, &cp->vaddr,
                              htons(ntohs(cp->vport)-1), &p);
        n_cp = ip_vs_conn_in_get(&p);
        if (!n_cp) {
            /* This is ipv4 only; use the same real server as the
             * control connection. dport = cp->dport - 1.
             */
            n_cp = ip_vs_conn_new(&p, AF_INET, &cp->daddr,
                                  htons(ntohs(cp->dport)-1),
                                  IP_VS_CONN_F_NFCT, cp->dest,
                                  skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }
    }

    /*
     * Move tunnel to listen state.
     * Sets the state of the connection entry to LISTEN.
     */
    ip_vs_tcp_conn_listen(n_cp);
    ip_vs_conn_put(n_cp);

    return 1;
}
/*
 * Look at outgoing ftp packets to catch the response to a PASV command
 * from the server (inside-to-outside).
 * When we see one, we build a connection entry with the client address,
 * client port 0 (unknown at the moment), the server address and the
 * server port.  Mark the current connection entry as a control channel
 * of the new entry. All this work is just to make the data connection
 * can be scheduled to the right server later.
 *
 * The outgoing packet should be something like
 *   "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)".
 * xxx,xxx,xxx,xxx is the server address, ppp,ppp is the server port number.
 */
static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
                         struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_limit;
    char *start, *end;
    union nf_inet_addr from;
    __be16 port;
    struct ip_vs_conn *n_cp;
    char buf[24];       /* xxx.xxx.xxx.xxx,ppp,ppp\000 */
    unsigned int buf_len;
    int ret = 0;
    enum ip_conntrack_info ctinfo;
    struct nf_conn *ct;

    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;

    /* Passive mode: the client initiates the data connection and the
     * server announces its address and port. The port comes from the
     * server side, so it has to be captured in the out direction.
     */
    if (cp->app_data == &ip_vs_ftp_pasv) {
        iph = ip_hdr(skb);
        th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);
        data = (char *)th + (th->doff << 2);
        data_limit = skb_tail_pointer(skb);

        if (ip_vs_ftp_get_addrport(data, data_limit,
                                   SERVER_STRING,
                                   sizeof(SERVER_STRING)-1,
                                   '(', ')',
                                   &from.ip, &port,
                                   &start, &end) != 1)
            return 1;

        IP_VS_DBG(7, "PASV response (%pI4:%d) -> %pI4:%d detected\n",
                  &from.ip, ntohs(port), &cp->caddr.ip, 0);

        /*
         * Now update or create an connection entry for it,
         * using the address and port the server has opened.
         */
        {
            struct ip_vs_conn_param p;

            /* The client port is filled in as 0 here. */
            ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                                  iph->protocol, &from, port,
                                  &cp->caddr, 0, &p);
            /* Look for an existing entry in the out direction. */
            n_cp = ip_vs_conn_out_get(&p);
        }
        if (!n_cp) {
            struct ip_vs_conn_param p;

            ip_vs_conn_fill_param(cp->ipvs,
                                  AF_INET, IPPROTO_TCP, &cp->caddr,
                                  0, &cp->vaddr, port, &p);
            /* As above, this is ipv4 only */
            /* The client port may be 0 because it is not known yet. */
            n_cp = ip_vs_conn_new(&p, AF_INET, &from, port,
                                  IP_VS_CONN_F_NO_CPORT |
                                  IP_VS_CONN_F_NFCT,
                                  cp->dest, skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }

        /*
         * Replace the old passive address with the new one.
         * Rewrites the payload so the client is told the new IP.
         */
        from.ip = n_cp->vaddr.ip;
        port = n_cp->vport;
        snprintf(buf, sizeof(buf), "%u,%u,%u,%u,%u,%u",
                 ((unsigned char *)&from.ip)[0],
                 ((unsigned char *)&from.ip)[1],
                 ((unsigned char *)&from.ip)[2],
                 ((unsigned char *)&from.ip)[3],
                 ntohs(port) >> 8,
                 ntohs(port) & 0xFF);

        buf_len = strlen(buf);

        /* Use the nf conntrack machinery to do the rewrite. */
        ct = nf_ct_get(skb, &ctinfo);
        if (ct) {
            bool mangled;

            /* If mangling fails this function will return 0
             * which will cause the packet to be dropped.
             * Mangling can only fail under memory pressure,
             * hopefully it will succeed on the retransmitted
             * packet.
             * This step involves seqadj.
             */
            mangled = nf_nat_mangle_tcp_packet(skb, ct, ctinfo,
                                               iph->ihl * 4,
                                               start - data,
                                               end - start,
                                               buf, buf_len);
            if (mangled) {
                ip_vs_nfct_expect_related(skb, ct, n_cp,
                                          IPPROTO_TCP, 0, 0);
                if (skb->ip_summed == CHECKSUM_COMPLETE)
                    skb->ip_summed = CHECKSUM_UNNECESSARY;
                /* csum is updated */
                ret = 1;
            }
        }

        /*
         * Not setting 'diff' is intentional, otherwise the sequence
         * would be adjusted twice.
         */

        cp->app_data = NULL;
        /* Set the state of the connection entry to LISTEN. */
        ip_vs_tcp_conn_listen(n_cp);
        ip_vs_conn_put(n_cp);
        return ret;
    }
    return 1;
}
/*
 *  Fill a no_client_port connection with a client port number
 */
void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)
{
    if (ip_vs_conn_unhash(cp)) {
        spin_lock_bh(&cp->lock);
        if (cp->flags & IP_VS_CONN_F_NO_CPORT) {
            atomic_dec(&ip_vs_conn_no_cport_cnt);
            cp->flags &= ~IP_VS_CONN_F_NO_CPORT;
            cp->cport = cport;
        }
        spin_unlock_bh(&cp->lock);

        /* hash on new dport */
        ip_vs_conn_hash(cp);
    }
}
In the analysis of passive mode above, we noted that LVS calls ip_vs_nfct_expect_related to add an expected connection to nf conntrack and registers a callback (expectfn) for it.
/*
 * Create NF conntrack expectation with wildcard (optional) source port.
 * Then the default callback function will alter the reply and will confirm
 * the conntrack entry when the first packet comes.
 * Use port 0 to expect connection from any port.
 */
void ip_vs_nfct_expect_related(struct sk_buff *skb, struct nf_conn *ct,
                               struct ip_vs_conn *cp, u_int8_t proto,
                               const __be16 port, int from_rs)
{
    struct nf_conntrack_expect *exp;

    if (ct == NULL)
        return;

    exp = nf_ct_expect_alloc(ct);
    if (!exp)
        return;

    nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct),
                      /* Source IP: daddr if the connection comes from
                       * the real server (active mode), else caddr.
                       */
                      from_rs ? &cp->daddr : &cp->caddr,
                      /* Destination IP: caddr if the connection comes
                       * from the real server (active mode), else vaddr.
                       */
                      from_rs ? &cp->caddr : &cp->vaddr,
                      proto, port ? &port : NULL,
                      from_rs ? &cp->cport : &cp->vport);

    /* Register the expect callback. */
    exp->expectfn = ip_vs_nfct_expect_callback;

    IP_VS_DBG(7, "%s: ct=%p, expect tuple=" FMT_TUPLE "\n",
              __func__, ct, ARG_TUPLE(&exp->tuple));
    nf_ct_expect_related(exp);
    nf_ct_expect_put(exp);
}

/*
 * Called from init_conntrack() as expectfn handler.
 * The ct argument is the conntrack created by the first packet of the new
 * connection, so the direction of that first packet is the original
 * direction. Here that means:
 * in active mode, RS->client is the original direction;
 * in passive mode, client->RS is the original direction.
 */
static void ip_vs_nfct_expect_callback(struct nf_conn *ct,
                                       struct nf_conntrack_expect *exp)
{
    struct nf_conntrack_tuple *orig, new_reply;
    struct ip_vs_conn *cp;
    struct ip_vs_conn_param p;
    struct net *net = nf_ct_net(ct);

    if (exp->tuple.src.l3num != PF_INET)
        return;

    /*
     * We assume that no NF locks are held before this callback.
     * ip_vs_conn_out_get and ip_vs_conn_in_get should match their
     * expectations even if they use wildcard values, now we provide the
     * actual values from the newly created original conntrack direction.
     * The conntrack is confirmed when packet reaches IPVS hooks.
     */

    /* RS->CLIENT: active mode */
    orig = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
    /* Build the IPVS lookup parameters from the conntrack tuple. For the
     * in2out lookup, the destination IP/port are matched against the
     * entry's client IP/port, and the source IP/port against the entry's
     * daddr/dport.
     */
    ip_vs_conn_fill_param(net_ipvs(net), exp->tuple.src.l3num,
                          orig->dst.protonum,
                          &orig->src.u3, orig->src.u.tcp.port,
                          &orig->dst.u3, orig->dst.u.tcp.port, &p);
    cp = ip_vs_conn_out_get(&p);
    if (cp) {
        /* Change reply CLIENT->RS to CLIENT->VS */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
                  FMT_TUPLE ", found inout cp=" FMT_CONN "\n",
                  __func__, ct, ct->status,
                  ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
                  ARG_CONN(cp));
        /* After the expectation is hit, the conntrack created by nf-ct
         * has original direction dip,dport -> cip,cport and reply
         * direction cip,cport -> dip,dport. What we actually need for
         * the reply is cip,cport -> vip,vport, so it is changed here.
         * Remember that this code runs at the PREROUTING hook.
         */
        new_reply.dst.u3 = cp->vaddr;
        new_reply.dst.u.tcp.port = cp->vport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", " FMT_TUPLE
                  ", inout cp=" FMT_CONN "\n",
                  __func__, ct,
                  ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
                  ARG_CONN(cp));
        goto alter;
    }

    /* CLIENT->VS: passive mode */
    /* Look up the connection entry in the request direction. */
    cp = ip_vs_conn_in_get(&p);
    if (cp) {
        /* Change reply VS->CLIENT to RS->CLIENT */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
                  FMT_TUPLE ", found outin cp=" FMT_CONN "\n",
                  __func__, ct, ct->status,
                  ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
                  ARG_CONN(cp));
        /* After the expectation is hit, the conntrack created by nf-ct
         * has original direction cip,cport -> vip,vport and reply
         * direction vip,vport -> cip,cport. What we actually need for
         * the reply is dip,dport -> cip,cport, so it is changed here.
         * Remember that this code runs at the PREROUTING hook.
         */
        new_reply.src.u3 = cp->daddr;
        new_reply.src.u.tcp.port = cp->dport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", "
                  FMT_TUPLE ", outin cp=" FMT_CONN "\n",
                  __func__, ct,
                  ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
                  ARG_CONN(cp));
        goto alter;
    }

    IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuple=" FMT_TUPLE
              " - unknown expect\n",
              __func__, ct, ct->status, ARG_TUPLE(orig));
    return;

alter:
    /* Never alter conntrack for non-NAT conns */
    /* Only NAT mode needs this. */
    if (IP_VS_FWD_METHOD(cp) == IP_VS_CONN_F_MASQ)
        nf_conntrack_alter_reply(ct, &new_reply);
    ip_vs_conn_put(cp);
    return;
}