sicily 1035. DNA matching

Date: 2019-11-11

Tags: sicily, dna, matching


Description

DNA (deoxyribonucleic acid) is found in every living creature as the storage medium for genetic information. It is composed of subunits called nucleotides that are strung together into polymer chains. DNA polymer chains are more commonly called DNA strands.

There are four kinds of nucleotides in DNA, distinguished by the chemical group, or base, attached to each. The four bases are adenine, guanine, cytosine and thymine, abbreviated as A, G, C and T (these letters will be used to refer to nucleotides containing these bases). Single nucleotides are linked together end-to-end to form DNA single strands via chemical reactions. For simplicity, we can use a string composed of the letters A, T, C and G to denote a single strand, such as ATTCGAC, but we must also note that the sequence of nucleotides in any strand has a natural orientation, so ATTCGAC and CAGCTTA cannot be viewed as identical strands.

DNA does not usually exist in nature as free single strands, though. Under appropriate conditions single strands will pair up and twist around each other, forming the famous double helix structure. This pairing occurs because of a mutual attraction, called hydrogen bonding, that exists between As and Ts, and between Gs and Cs. Hence A/T and G/C are called complementary base pairs.

In molecular biology experiments dealing with DNA, one important process is to match two complementary single strands to make a DNA double strand. Here we impose the constraint that two complementary single strands must have equal length, and the nucleotides in the same position of the two single strands must be complementary pairs. For example, ATTCGAC and TAAGCTG are complementary, but CAGCTTA and TAAGCTG are not, and neither are ATTCGAC and GTAAGCT.

As a biology research assistant, your boss has assigned you a job: given n single strands, find the maximum number of double strands that could be made (of course each strand can be used at most once). If n is small, you can find the answer with the help of pen and paper; however, sometimes n could be quite large… Fortunately you are good at programming and there is a computer in front of you, so you can write a program to help yourself. But you must know that you have many other assignments to finish, and you should not waste too much time here, so hurry up please!

Input

Input may contain multiple test cases. The first line is a positive integer T (T <= 20), indicating the number of test cases that follow. In each test case, the first line is a positive integer n (n <= 100), denoting the number of single strands below. Then n lines follow, each a string composed of the four capital letters A, T, C and G. The length of each string is no more than 100.

Output

For each test case, the output is one line containing a single integer, the maximum number of double strands that can be formed using those given single strands.

Sample Input

2
3
ATCG
TAGC
TAGG
2
AATT
ATTA

Sample Output

1
0

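There does not seem to be much technical depth here: once you have read the problem carefully and sorted out the approach, you can write the solution directly.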

Problem summary:

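DNA single strands are paired into double strands: A pairs with T and C pairs with G. A strand that has already been paired cannot be reused, paired strands must have equal length and be complementary at every position, and the orientation of a strand cannot be changed (CTA is not the same as ATC). Each test case gives n single strands and asks for the maximum number of double strands that can be formed.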
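The pairing rule in the summary can also be stated as "b equals the position-wise complement of a". The sketch below expresses it that way; `complementBase`, `complementStrand` and `canPair` are hypothetical helper names that do not appear in the solution, and the code assumes strands contain only the letters A, T, C and G.

```cpp
#include <string>

// Hypothetical helper: complement of a single base.
// Assumes the input is one of A, T, C, G; anything else maps to '?'.
char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return '?';
    }
}

// Hypothetical helper: complement every base in place, keeping the
// orientation of the strand (no reversal).
std::string complementStrand(const std::string& strand) {
    std::string result = strand;
    for (char& base : result) {
        base = complementBase(base);
    }
    return result;
}

// Two strands can form a double strand exactly when one is the
// position-wise complement of the other. Comparing the strings also
// enforces the equal-length constraint.
bool canPair(const std::string& a, const std::string& b) {
    return b == complementStrand(a);
}
```

With the examples from the problem statement, `canPair("ATTCGAC", "TAAGCTG")` holds, while `canPair("CAGCTTA", "TAAGCTG")` and `canPair("ATTCGAC", "GTAAGCT")` do not.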

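To make pruning easy, sort the strands by length; when scanning forward, stop the loop as soon as it leaves the run of equal-length strands. A bool array records whether each strand has been used, and an int array caches the length of each string. Because the input is small, pruning plus a double loop is enough to pass; the complexity is about O(n²l), where l is the string length.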

Postscript: building a trie from the complement strands should probably bring this down to O(nl).

#include<iostream>
#include<cstdio>
#include<string>
#include<algorithm>
using namespace std;

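// Order strands by descending length so equal-length strands are adjacent.
bool cmp(const string& a, const string& b) {
    return a.size() > b.size();
}

// Assumes a and b have equal length; true if they are complementary
// at every position.
bool match(const string& a, const string& b) {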
    int len = a.size();
    for (int i = 0; i < len; ++i) {
        if (a[i] == 'A' && b[i] != 'T')
            return false;
        if (a[i] == 'T' && b[i] != 'A')
            return false;
        if (a[i] == 'C' && b[i] != 'G')
            return false;
        if (a[i] == 'G' && b[i] != 'C')
            return false;
    }
    return true; 
}


int main(void) {
    int t, n;
    string str[101];
    int len[101];
    bool used[101];

    // read the number of test cases (t <= 20)
    cin >> t;
    while(t--) {
        // scan and store n strings(n <= 100)
        cin >> n;
        for (int i = 0; i < n; ++i) {
            cin >> str[i];
        }
        
        // sort by length
        sort(str, str + n, cmp);

        for (int i = 0; i < n; ++i) {
            // store length
            len[i] = str[i].size();
            // init used
            used[i] = false;
        }
        
        int count = 0;
        
        for (int i = 0; i < n; ++i) {
            if (!used[i]) {
                for (int j = i + 1; j < n && len[j] == len[i]; ++j) {
                    // check the cheap used flag first to skip needless comparisons
                    if (!used[j] && match(str[i], str[j])) {
                        count++;
                        used[i] = used[j] = true;
                        break;
                    }
                }
            }
        }
        
        cout << count << '\n';
    }

    return 0;
}
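The trie idea from the postscript can be sketched roughly as follows. This is only one possible reading of it, not a tested submission: the trie stores the complements of strands that are still unpaired, so each new strand is looked up under its own letters and pairs with an earlier strand if that path ends at a node with a pending count. `baseIndex`, `Node` and `countDoubleStrands` are hypothetical names, and the input is assumed to contain only A, C, G and T. Each strand is walked at most twice, so the work is O(nl).

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical helper: map A, C, G, T to 0..3 so that the complement
// of index k is 3 - k (A <-> T, C <-> G).
// Assumes the character is one of the four bases.
int baseIndex(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;  // 'T'
    }
}

struct Node {
    std::array<int, 4> next{{-1, -1, -1, -1}};  // child indices, -1 if absent
    int pending = 0;  // unpaired strands whose complement ends here
};

int countDoubleStrands(const std::vector<std::string>& strands) {
    std::vector<Node> trie(1);  // node 0 is the root
    int pairs = 0;

    for (const std::string& s : strands) {
        // Look the strand up: a hit means an earlier unpaired strand
        // has this strand as its complement.
        int node = 0;
        for (char c : s) {
            node = trie[node].next[baseIndex(c)];
            if (node == -1) break;
        }
        if (node != -1 && trie[node].pending > 0) {
            --trie[node].pending;
            ++pairs;
            continue;
        }

        // No partner yet: store this strand's complement for later strands.
        node = 0;
        for (char c : s) {
            int k = 3 - baseIndex(c);
            if (trie[node].next[k] == -1) {
                trie[node].next[k] = static_cast<int>(trie.size());
                trie.emplace_back();
            }
            node = trie[node].next[k];
        }
        ++trie[node].pending;
    }
    return pairs;
}
```

Pairing greedily is safe here because every strand has exactly one possible partner string, so the order in which identical strands are consumed does not change the total. On the two sample cases this sketch gives 1 and 0.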
