Handling FTP Traffic in LVS NAT Mode

Concepts

Active and passive modes of the FTP data channel

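Active mode: the server uses its designated data port (20 by default) to actively connect to the port supplied by the client, and then sends data to the client.
The client sends: "PORT xxx,xxx,xxx,xxx,ppp,ppp", then waits for the server to initiate the data connection.
The server replies "200" to accept, at which point the data channel can be established.

Passive mode: the server accepts the client's request to use passive mode, starts listening on a data port, and passively waits for the client to connect before sending data to it.

The client sends: "PASV", telling the server to use passive mode.
The server replies: "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)". This accepts the request and tells the client the IP and port the server is listening on.

In short: if the server connects to the client, it is active mode; if the server waits for the client to connect (the client connects to the server), it is passive mode.
FTP has active and passive modes, while protocols such as SSH do not, because FTP transfers data over a separate port from the control connection.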
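
Both the PORT command and the 227 reply carry the endpoint as six comma-separated decimal bytes: four for the IPv4 address and two for the port (port = p1 * 256 + p2). The following is a minimal user-space sketch of decoding that argument. The helper name ftp_parse_hostport is hypothetical and is not an IPVS function; the kernel uses its own parser, ip_vs_ftp_get_addrport.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of IPVS): parse the "h1,h2,h3,h4,p1,p2"
 * argument of a PORT command or a 227 reply.
 * Returns 0 on success and stores address and port in host byte order,
 * or -1 if the string is malformed.
 */
static int ftp_parse_hostport(const char *s, uint32_t *ip, uint16_t *port)
{
    unsigned int v[6];
    int i;

    assert(s != NULL && ip != NULL && port != NULL);

    if (sscanf(s, "%u,%u,%u,%u,%u,%u",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return -1;

    /* Every field is a single byte. */
    for (i = 0; i < 6; i++)
        if (v[i] > 255)
            return -1;

    *ip = (v[0] << 24) | (v[1] << 16) | (v[2] << 8) | v[3];
    *port = (uint16_t)((v[4] << 8) | v[5]);
    return 0;
}
```

For example, the argument "192,168,1,10,19,137" decodes to 192.168.1.10 and port 19 * 256 + 137 = 5001.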

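How LVS handles the FTP data channel

out2in

As described above, in active mode the client sends its listening IP and port to the server with the PORT command, so the data-channel information can be extracted in the out2in direction. Currently only NAT mode needs FTP ALG handling. The DNAT path parses the FTP payload, finds the PORT command, and adds a connection entry for the data channel.

Suppose the IP extracted from the payload is dataip, the port is dataport, and the control channel's connection entry is cn. The connection entry that gets added then has the following 7-tuple (the protocol, TCP, plus the six fields below):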

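caddr: dataip
cport: dataport
vaddr: cn->vaddr
vport: cn->vport - 1 (i.e. 20)
daddr: cn->daddr
dport: cn->dport - 1 (i.e. 20)

After this entry is created, its state is set to IP_VS_TCP_S_LISTEN and its timer is set to the timeout for that state. Active mode needs no sequence-number correction, because the PORT payload is forwarded unmodified: the address inside it is not rewritten, so the TCP payload length does not change.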
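
The derivation of the active-mode data-channel entry from the control-channel entry can be sketched as follows. This is an illustration only: struct conn_tuple and active_data_tuple are hypothetical stand-ins for the address fields of struct ip_vs_conn, and values are kept in host byte order for readability, whereas the kernel stores them in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the address fields of
 * struct ip_vs_conn. Host byte order is an assumption of this sketch. */
struct conn_tuple {
    uint32_t caddr; uint16_t cport;   /* client */
    uint32_t vaddr; uint16_t vport;   /* virtual service */
    uint32_t daddr; uint16_t dport;   /* real server */
};

/* Build the data-channel tuple for active mode from the control-channel
 * tuple and the address/port announced in the PORT command. */
static struct conn_tuple active_data_tuple(const struct conn_tuple *ctl,
                                           uint32_t dataip, uint16_t dataport)
{
    struct conn_tuple t;

    assert(ctl != NULL);

    t.caddr = dataip;                      /* client endpoint from PORT */
    t.cport = dataport;
    t.vaddr = ctl->vaddr;                  /* same virtual address */
    t.vport = (uint16_t)(ctl->vport - 1);  /* 21 - 1 = 20 */
    t.daddr = ctl->daddr;                  /* same real server */
    t.dport = (uint16_t)(ctl->dport - 1);  /* 21 - 1 = 20 */
    return t;
}
```

With a control channel on port 21 at both the virtual service and the real server, the data-channel entry ends up with vport and dport equal to 20, matching the tuple listed above.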

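in2out

As described above, in passive mode the server sends its listening IP and port to the client, so the data-channel information can be extracted in the in2out direction. Currently only NAT mode needs FTP ALG handling. Because this is the in2out direction, the data-channel connection entry is handled in the SNAT path, the reverse of the DNAT applied to out2in packets.

Suppose the IP extracted from the payload is dataip, the port is dataport, and the control channel's connection entry is cn. The connection entry that gets added then has the following 7-tuple: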

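caddr: cn->caddr
cport: 0
vaddr: cn->vaddr
vport: dataport
daddr: dataip
dport: dataport

The tuple above shows that LVS expects the client's data connection to be addressed to the VIP as well, which differs from the dataip announced by the real server. The address in the FTP payload therefore has to be rewritten to the VIP (the port stays dataport) so that the client's data connection hits the entry above. In addition, the source port the client will use for the data connection is not yet known, so cport in the entry is set to 0 and the IP_VS_CONN_F_NO_CPORT flag is set. The flag means that cport must be filled in once it is known, which is when the client's SYN for the data connection arrives and hits this entry.

Because the application payload is modified, the TCP sequence numbers are affected. LVS relies on netfilter's seqadj mechanism to handle this and sets the IP_VS_CONN_F_NFCT flag, which means the conntrack entry must not be deleted.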
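
Why the rewrite disturbs the sequence numbers can be seen from the string lengths alone: when the VIP is printed with a different number of digits than the real server address, the payload grows or shrinks, and every later sequence number on that flow must be shifted by the same amount. A minimal sketch, assuming host-byte-order inputs; ftp_format_hostport and ftp_rewrite_delta are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format "h1,h2,h3,h4,p1,p2" from an address and port in host byte order.
 * Returns the string length, or -1 if buf is too small. */
static int ftp_format_hostport(char *buf, size_t len,
                               uint32_t ip, uint16_t port)
{
    int n = snprintf(buf, len, "%u,%u,%u,%u,%u,%u",
                     (unsigned int)((ip >> 24) & 0xFF),
                     (unsigned int)((ip >> 16) & 0xFF),
                     (unsigned int)((ip >> 8) & 0xFF),
                     (unsigned int)(ip & 0xFF),
                     (unsigned int)(port >> 8),
                     (unsigned int)(port & 0xFF));

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}

/* Length change caused by replacing old_arg with the VIP-based argument.
 * This is the offset that sequence adjustment has to compensate for. */
static int ftp_rewrite_delta(const char *old_arg, uint32_t vip, uint16_t vport)
{
    char buf[24];   /* "255,255,255,255,255,255" plus NUL */
    int n = ftp_format_hostport(buf, sizeof(buf), vip, vport);

    assert(n >= 0); /* 24 bytes always suffice for six byte-sized fields */
    return n - (int)strlen(old_arg);
}
```

For instance, replacing "10,0,0,5,195,80" (15 characters) with "203,0,113,10,195,80" (19 characters) lengthens the packet by 4 bytes.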

Key function analysis

out2in

/*
 * Look at incoming ftp packets to catch the PASV/PORT command
 * (outside-to-inside).
 *
 * The incoming packet having the PORT command should be something like
 *      "PORT xxx,xxx,xxx,xxx,ppp,ppp\n".
 * xxx,xxx,xxx,xxx is the client address, ppp,ppp is the client port number.
 * In this case, we create a connection entry using the client address and
 * port, so that the active ftp data connection from the server can reach
 * the client.
 */
static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
            struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_start, *data_limit;
    char *start, *end;
    union nf_inet_addr to;
    __be16 port;
    struct ip_vs_conn *n_cp;

    /* no diff required for incoming packets */
    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;

    /*
     * Detecting whether it is passive
     */
    iph = ip_hdr(skb);
    th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);

    /* Since there may be OPTIONS in the TCP packet and the HLEN is
       the length of the header in 32-bit multiples, it is accurate
       to calculate data address by th+HLEN*4 */
    data = data_start = (char *)th + (th->doff << 2);
    data_limit = skb_tail_pointer(skb);
    // Check for passive mode: 6 is the length of "PASV\r\n"; a brute-force byte-by-byte scan is used here.
    while (data <= data_limit - 6) {
        if (strncasecmp(data, "PASV\r\n", 6) == 0) {
            /* Passive mode on */
            IP_VS_DBG(7, "got PASV at %td of %td\n",
                  data - data_start,
                  data_limit - data_start);
            cp->app_data = &ip_vs_ftp_pasv;
            return 1;
        }
        data++;
    }

    /*
     * To support virtual FTP server, the scenerio is as follows:
     *       FTP client ----> Load Balancer ----> FTP server
     * First detect the port number in the application data,
     * then create a new connection entry for the coming data
     * connection.
     * This case is active mode.
     */
    if (ip_vs_ftp_get_addrport(data_start, data_limit,
                   CLIENT_STRING, sizeof(CLIENT_STRING)-1,
                   ' ', '\r', &to.ip, &port,
                   &start, &end) != 1)
        return 1;

    IP_VS_DBG(7, "PORT %pI4:%d detected\n", &to.ip, ntohs(port));

    /* Passive mode off */
    cp->app_data = NULL;

    /*
     * Now update or create a connection entry for it
     */
    IP_VS_DBG(7, "protocol %s %pI4:%d %pI4:%d\n",
          ip_vs_proto_name(iph->protocol),
          &to.ip, ntohs(port), &cp->vaddr.ip, 0);

    {
        struct ip_vs_conn_param p;
        // Create the request-direction connection entry for active mode.
        // The virtual data port is the control port minus one, i.e. 20.
        ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                      iph->protocol, &to, port, &cp->vaddr,
                      htons(ntohs(cp->vport)-1), &p); // vport = cp->vport - 1
        n_cp = ip_vs_conn_in_get(&p);
        if (!n_cp) {
            /* This is ipv4 only; the same real server is used. */
            n_cp = ip_vs_conn_new(&p, AF_INET, &cp->daddr,
                          htons(ntohs(cp->dport)-1), // dport = cp->dport - 1
                          IP_VS_CONN_F_NFCT, cp->dest,
                          skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }
    }

    /*
     *    Move tunnel to listen state
     *  Set the connection entry's state to LISTEN.
     */
    ip_vs_tcp_conn_listen(n_cp);
    ip_vs_conn_put(n_cp);

    return 1;
}

in2out

/*
 * Look at outgoing ftp packets to catch the response to a PASV command
 * from the server (inside-to-outside).
 * When we see one, we build a connection entry with the client address,
 * client port 0 (unknown at the moment), the server address and the
 * server port.  Mark the current connection entry as a control channel
 * of the new entry. All this work is just to make the data connection
 * can be scheduled to the right server later.
 *
 * The outgoing packet should be something like
 *   "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)".
 * xxx,xxx,xxx,xxx is the server address, ppp,ppp is the server port number.
 */
static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
             struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_limit;
    char *start, *end;
    union nf_inet_addr from;
    __be16 port;
    struct ip_vs_conn *n_cp;
    char buf[24];        /* xxx.xxx.xxx.xxx,ppp,ppp\000 */
    unsigned int buf_len;
    int ret = 0;
    enum ip_conntrack_info ctinfo;
    struct nf_conn *ct;

    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;
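    // Passive mode: the client initiates the data connection and the server announces its address and port.
    if (cp->app_data == &ip_vs_ftp_pasv) { // the port comes from the server, so it is captured in the out direction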
        iph = ip_hdr(skb);
        th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);
        data = (char *)th + (th->doff << 2);
        data_limit = skb_tail_pointer(skb);

        if (ip_vs_ftp_get_addrport(data, data_limit,
                       SERVER_STRING,
                       sizeof(SERVER_STRING)-1,
                       '(', ')',
                       &from.ip, &port,
                       &start, &end) != 1)
            return 1;

        IP_VS_DBG(7, "PASV response (%pI4:%d) -> %pI4:%d detected\n",
              &from.ip, ntohs(port), &cp->caddr.ip, 0);

        /*
         * Now update or create an connection entry for it
         * The address and port opened by the server were obtained above.
         */
        {
            struct ip_vs_conn_param p;
            ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                          iph->protocol, &from, port,
                          &cp->caddr, 0, &p); // the client port is filled in as 0
            // check whether an entry already exists in the outgoing direction
            n_cp = ip_vs_conn_out_get(&p);
        }
        if (!n_cp) {
            struct ip_vs_conn_param p;
            ip_vs_conn_fill_param(cp->ipvs,
                          AF_INET, IPPROTO_TCP, &cp->caddr,
                          0, &cp->vaddr, port, &p);
            /* As above, this is ipv4 only */
            /* The client port may be 0 because it is not known yet */
            n_cp = ip_vs_conn_new(&p, AF_INET, &from, port,
                          IP_VS_CONN_F_NO_CPORT |
                          IP_VS_CONN_F_NFCT,
                          cp->dest, skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }

        /*
         * Replace the old passive address with the new one
         * Rewrite the payload so that the client is told the new IP.
         */
        from.ip = n_cp->vaddr.ip;
        port = n_cp->vport;
        snprintf(buf, sizeof(buf), "%u,%u,%u,%u,%u,%u",
             ((unsigned char *)&from.ip)[0],
             ((unsigned char *)&from.ip)[1],
             ((unsigned char *)&from.ip)[2],
             ((unsigned char *)&from.ip)[3],
             ntohs(port) >> 8,
             ntohs(port) & 0xFF);

        buf_len = strlen(buf);
        // use the nf_conntrack machinery to do the rewrite
        ct = nf_ct_get(skb, &ctinfo);
        if (ct) {
            bool mangled;

            /* If mangling fails this function will return 0
             * which will cause the packet to be dropped.
             * Mangling can only fail under memory pressure,
             * hopefully it will succeed on the retransmitted
             * packet.
             * This involves seqadj.
             */
            mangled = nf_nat_mangle_tcp_packet(skb, ct, ctinfo,
                               iph->ihl * 4,
                               start - data,
                               end - start,
                               buf, buf_len);
            if (mangled) {
                ip_vs_nfct_expect_related(skb, ct, n_cp,
                              IPPROTO_TCP, 0, 0);
                if (skb->ip_summed == CHECKSUM_COMPLETE)
                    skb->ip_summed = CHECKSUM_UNNECESSARY;
                /* csum is updated */
                ret = 1;
            }
        }

        /*
         * Not setting 'diff' is intentional, otherwise the sequence
         * would be adjusted twice.
         */

        cp->app_data = NULL;
        // Set the connection entry's state to LISTEN.
        ip_vs_tcp_conn_listen(n_cp);
        ip_vs_conn_put(n_cp);
        return ret;
    }
    return 1;
}

IP_VS_CONN_F_NO_CPORT

/*
 *    Fill a no_client_port connection with a client port number
 */
void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)
{
    if (ip_vs_conn_unhash(cp)) {
        spin_lock_bh(&cp->lock);
        if (cp->flags & IP_VS_CONN_F_NO_CPORT) {
            atomic_dec(&ip_vs_conn_no_cport_cnt);
            cp->flags &= ~IP_VS_CONN_F_NO_CPORT;
            cp->cport = cport;
        }
        spin_unlock_bh(&cp->lock);

        /* hash on new dport */
        ip_vs_conn_hash(cp);
    }
}
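
The effect of IP_VS_CONN_F_NO_CPORT can be pictured as a wildcard match on the client port that is resolved by the first packet. The sketch below is a deliberately simplified user-space model, not the kernel's lookup code: struct data_conn, CONN_F_NO_CPORT and data_conn_match are hypothetical names, and the real implementation also rehashes the entry, as ip_vs_conn_fill_cport above shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IP_VS_CONN_F_NO_CPORT. */
#define CONN_F_NO_CPORT 0x1u

/* Hypothetical, simplified passive-mode data-channel entry. */
struct data_conn {
    uint32_t caddr; uint16_t cport;   /* client; cport is 0 until known */
    uint32_t vaddr; uint16_t vport;   /* virtual service */
    unsigned int flags;
};

/* Match an incoming packet against the entry. While the entry still has
 * no client port, any source port matches and is recorded; afterwards
 * only that recorded port matches. */
static bool data_conn_match(struct data_conn *c,
                            uint32_t saddr, uint16_t sport,
                            uint32_t daddr, uint16_t dport)
{
    assert(c != NULL);

    if (c->caddr != saddr || c->vaddr != daddr || c->vport != dport)
        return false;

    if (c->flags & CONN_F_NO_CPORT) {
        c->cport = sport;               /* first packet fixes the port */
        c->flags &= ~CONN_F_NO_CPORT;
        return true;
    }
    return c->cport == sport;
}
```

In this model the client's SYN for the data connection both hits the entry and pins its source port, so a later packet from another source port no longer matches.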

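The relationship between LVS and nf_conntrack

When analyzing passive mode above, we saw that LVS calls ip_vs_nfct_expect_related to add an expectation to nf_conntrack and to register an expectfn callback for it.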

/*
 * Create NF conntrack expectation with wildcard (optional) source port.
 * Then the default callback function will alter the reply and will confirm
 * the conntrack entry when the first packet comes.
 * Use port 0 to expect connection from any port.
 */
void ip_vs_nfct_expect_related(struct sk_buff *skb, struct nf_conn *ct,
                   struct ip_vs_conn *cp, u_int8_t proto,
                   const __be16 port, int from_rs)
{
    struct nf_conntrack_expect *exp;

    if (ct == NULL)
        return;

    exp = nf_ct_expect_alloc(ct);
    if (!exp)
        return;

    nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct),
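            from_rs ? &cp->daddr : &cp->caddr, // source IP: daddr if the connection comes from the real server (active mode), otherwise caddr
            from_rs ? &cp->caddr : &cp->vaddr, // destination IP: caddr if the connection comes from the real server (active mode), otherwise vaddr
            proto, port ? &port : NULL,
            from_rs ? &cp->cport : &cp->vport);
    // register the expectfn callback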
    exp->expectfn = ip_vs_nfct_expect_callback;

    IP_VS_DBG(7, "%s: ct=%p, expect tuple=" FMT_TUPLE "\n",
        __func__, ct, ARG_TUPLE(&exp->tuple));
    nf_ct_expect_related(exp);
    nf_ct_expect_put(exp);
}

/*
 * Called from init_conntrack() as expectfn handler.
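 * The ct argument is the conntrack created by the first packet of the new
 * connection, so the direction of that first packet is the original direction.
 * In active mode, RS->client is the original direction.
 * In passive mode, client->RS is the original direction.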
 */
static void ip_vs_nfct_expect_callback(struct nf_conn *ct,
    struct nf_conntrack_expect *exp)
{
    struct nf_conntrack_tuple *orig, new_reply;
    struct ip_vs_conn *cp;
    struct ip_vs_conn_param p;
    struct net *net = nf_ct_net(ct);

    if (exp->tuple.src.l3num != PF_INET)
        return;

    /*
     * We assume that no NF locks are held before this callback.
     * ip_vs_conn_out_get and ip_vs_conn_in_get should match their
     * expectations even if they use wildcard values, now we provide the
     * actual values from the newly created original conntrack direction.
     * The conntrack is confirmed when packet reaches IPVS hooks.
     */

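    /* RS->CLIENT: active mode */
    orig = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
    // Build the IPVS lookup parameters from the conntrack tuple. For the in2out lookup,
    // the destination IP/port are matched against cp's client IP/port and the source IP/port against cp's daddr/dport.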
    ip_vs_conn_fill_param(net_ipvs(net), exp->tuple.src.l3num, orig->dst.protonum,
                  &orig->src.u3, orig->src.u.tcp.port,
                  &orig->dst.u3, orig->dst.u.tcp.port, &p);
    cp = ip_vs_conn_out_get(&p);
    if (cp) {
        /* Change reply CLIENT->RS to CLIENT->VS */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", found inout cp=" FMT_CONN "\n",
              __func__, ct, ct->status,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
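        // After the expectation is hit, nf_conntrack creates the tuples as: original dip,dport -> cip,cport
        // and reply cip,cport -> dip,dport. What we actually need for the reply is cip,cport -> vip,vport,
        // so it is fixed up here. Note that this code runs at the PREROUTING hook.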
        new_reply.dst.u3 = cp->vaddr;
        new_reply.dst.u.tcp.port = cp->vport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", " FMT_TUPLE
              ", inout cp=" FMT_CONN "\n",
              __func__, ct,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
        goto alter;
    }

    /* CLIENT->VS: passive mode */
    /* Look up the connection entry in the request direction */
    cp = ip_vs_conn_in_get(&p);
    if (cp) {
        /* Change reply VS->CLIENT to RS->CLIENT */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", found outin cp=" FMT_CONN "\n",
              __func__, ct, ct->status,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
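        // After the expectation is hit, nf_conntrack creates the tuples as: original cip,cport -> vip,vport
        // and reply vip,vport -> cip,cport. What we actually need for the reply is dip,dport -> cip,cport,
        // so it is fixed up here. Note that this code runs at the PREROUTING hook.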
        new_reply.src.u3 = cp->daddr;
        new_reply.src.u.tcp.port = cp->dport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", outin cp=" FMT_CONN "\n",
              __func__, ct,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
        goto alter;
    }

    IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuple=" FMT_TUPLE
          " - unknown expect\n",
          __func__, ct, ct->status, ARG_TUPLE(orig));
    return;

alter:
    /* Never alter conntrack for non-NAT conns */
    /* Only NAT (masquerading) mode alters the conntrack */
    if (IP_VS_FWD_METHOD(cp) == IP_VS_CONN_F_MASQ)
        nf_conntrack_alter_reply(ct, &new_reply);
    ip_vs_conn_put(cp);
    return;
}
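
The two reply-tuple fix-ups in the callback can be summarized with a small model. This is a sketch under simplifying assumptions: struct tuple, passive_reply and active_reply are hypothetical names, the default reply is taken to be the plain inverse of the original tuple, and values are in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified conntrack tuple. */
struct tuple {
    uint32_t saddr; uint16_t sport;
    uint32_t daddr; uint16_t dport;
};

/* Default reply direction: the original tuple with the ends swapped. */
static struct tuple invert(const struct tuple *t)
{
    struct tuple r = { t->daddr, t->dport, t->saddr, t->sport };
    return r;
}

/* Passive mode: original is client -> VIP, but replies actually arrive
 * from the real server, so the reply source becomes daddr:dport. */
static struct tuple passive_reply(const struct tuple *orig,
                                  uint32_t daddr, uint16_t dport)
{
    struct tuple r;

    assert(orig != NULL);
    r = invert(orig);
    r.saddr = daddr;
    r.sport = dport;
    return r;
}

/* Active mode: original is RS -> client, but the client answers to the
 * VIP, so the reply destination becomes vaddr:vport. */
static struct tuple active_reply(const struct tuple *orig,
                                 uint32_t vaddr, uint16_t vport)
{
    struct tuple r;

    assert(orig != NULL);
    r = invert(orig);
    r.daddr = vaddr;
    r.dport = vport;
    return r;
}
```

In both cases only one end of the reply tuple changes: the real server end in passive mode and the virtual service end in active mode, which mirrors the new_reply assignments in the callback above.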