BZOJ4779: [Usaco2017 Open]Bovine Genomics

Date: 2019-11-13

Tags: bzoj4779 bzoj usaco2017 usaco open bovine genomics


Problem Description

Farmer John owns N cows with spots and N cows without spots. Having just completed a course in bovine genetics, he is convinced that the spots on his cows are caused by mutations in the bovine genome.

At great expense, Farmer John sequences the genomes of his cows. Each genome is a string of length M built from the four characters A, C, G, and T. When he lines up the genomes of his cows, he gets a table like the following, shown here for N=3 and M=8:

Positions:    1 2 3 4 5 6 7 8

Spotty Cow 1: A A T C C C A T
Spotty Cow 2: A C T T G C A A
Spotty Cow 3: G G T C G C A A

Plain Cow 1:  A C T C C C A G
Plain Cow 2:  A C T C G C A T
Plain Cow 3:  A C T T C C A T

Looking carefully at this table, he surmises that the sequence from position 2 through position 5 is sufficient to explain spottiness. That is, by looking at the characters in just these positions (that is, positions 2…5), Farmer John can predict which of his cows are spotty and which are not. For example, if he sees the characters GTCG in these locations, he knows the cow must be spotty.

Please help FJ find the length of the shortest sequence of positions that can explain spottiness.

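In short: given n strings of type A and n strings of type B, all of length m, find the shortest interval [l,r]

such that no A-string a and B-string b satisfy a[l,r]=b[l,r], and output its length.

n,m≤500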
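To make the condition concrete, here is a minimal brute-force sketch that tests the definition directly by comparing substrings. The names `explains` and `shortestWindowBrute` are hypothetical helpers introduced only for illustration; they are not part of the original solution, and this version is meant as a reference for small inputs rather than something tuned for the full limits.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// True if no spotty genome and plain genome agree on positions l..r
// (1-indexed, inclusive), i.e. the window [l, r] explains spottiness.
bool explains(const std::vector<std::string>& spotty,
              const std::vector<std::string>& plain, int l, int r) {
    assert(1 <= l && l <= r);
    std::set<std::string> seen;
    for (const std::string& a : spotty) seen.insert(a.substr(l - 1, r - l + 1));
    for (const std::string& b : plain)
        if (seen.count(b.substr(l - 1, r - l + 1))) return false;
    return true;
}

// Tries every window, shortest length first, and returns the first length
// that works. Returns -1 only if no window works at all, which the problem
// statement rules out (no spotty genome equals a plain genome).
int shortestWindowBrute(const std::vector<std::string>& spotty,
                        const std::vector<std::string>& plain) {
    assert(!spotty.empty());
    int m = static_cast<int>(spotty[0].size());
    for (int len = 1; len <= m; ++len)
        for (int l = 1; l + len - 1 <= m; ++l)
            if (explains(spotty, plain, l, l + len - 1)) return len;
    return -1;
}
```

On the sample table, `explains` accepts the window [2,5] but rejects [2,4], because Spotty Cow 2 and Plain Cow 3 both read CTT there.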

Input Format

The first line of input contains N (1≤N≤500) and M (3≤M≤500). The next N lines each contain a string of M characters; these describe the genomes of the spotty cows. The final N lines describe the genomes of the plain cows. No spotty cow has the same exact genome as a plain cow.

Output Format

Please print the length of the shortest sequence of positions that is sufficient to explain spottiness. A sequence of positions explains spottiness if the spottiness trait can be predicted with perfect accuracy among Farmer John's population of cows by looking at just those locations in the genome.

Sample Input

3 8
AATCCCAT
ACTTGCAA
GGTCGCAA
ACTCCCAG
ACTCGCAT
ACTTCCAT

Sample Output

4

Hint

No hint provided.

Source

Gold

Solution

My approach runs in \(O(nm\log^2 n)\).
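One way to account for that bound, assuming the binary search runs over lengths 1 to m and each check scans every window using a balanced-tree set:

\[
\underbrace{O(\log m)}_{\text{binary search}} \times \underbrace{O(m)}_{\text{windows per check}} \times \underbrace{O(n\log n)}_{\text{set inserts and lookups}} = O(nm\log n\log m),
\]

which matches \(O(nm\log^2 n)\) when n and m are of the same order (both are at most 500 here).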

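First hash the strings. Feasibility is clearly monotone in the window length: if a window [l,r] separates the two groups, every longer window containing it does too, so we can binary search the answer. For the check, use a set to hold the hash values.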
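The hashing step can be sketched as polynomial prefix hashes with unsigned 64-bit wrap-around. The struct name `PrefixHash` and the constant name `kHashBase` are assumptions made for this illustration; the base value 13131 is the one the solution code uses.

```cpp
#include <cassert>
#include <string>
#include <vector>

using ull = unsigned long long;
constexpr ull kHashBase = 13131;  // illustrative name; value taken from the solution

// Polynomial prefix hashes, computed mod 2^64 via unsigned overflow.
struct PrefixHash {
    std::vector<ull> h;  // h[i] = hash of the first i characters
    std::vector<ull> p;  // p[i] = kHashBase^i

    explicit PrefixHash(const std::string& s)
        : h(s.size() + 1, 0), p(s.size() + 1, 1) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            h[i + 1] = h[i] * kHashBase + static_cast<unsigned char>(s[i]);
            p[i + 1] = p[i] * kHashBase;
        }
    }

    // Hash of positions l..r, 1-indexed and inclusive, in O(1).
    ull get(int l, int r) const {
        assert(1 <= l && l <= r && r < static_cast<int>(h.size()));
        return h[r] - h[l - 1] * p[r - l + 1];
    }
};
```

Equal substrings at the same positions always produce equal hashes. Different substrings can collide in principle, so comparing hashes is a probabilistic shortcut rather than an exact test.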

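Enumerate every window of length x (the value being tested by the binary search) and insert the hash of that window for each A-string into the set. Then, for each B-string, look up the hash of its window in the set to see whether the same substring occurs among the A-strings. If some window has no match, length x is feasible.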
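The check and the surrounding binary search can be sketched as standalone functions that take the genomes as arguments instead of reading input. The names `prefixHash`, `check`, and `shortestWindow` and their signatures are assumptions chosen for this sketch, and the code assumes the input is non-empty with all strings of equal length.

```cpp
#include <set>
#include <string>
#include <vector>

using ull = unsigned long long;
constexpr ull BASE = 13131;

// Prefix hashes of one genome: h[i] covers the first i characters.
std::vector<ull> prefixHash(const std::string& s) {
    std::vector<ull> h(s.size() + 1, 0);
    for (std::size_t i = 0; i < s.size(); ++i)
        h[i + 1] = h[i] * BASE + static_cast<unsigned char>(s[i]);
    return h;
}

// True if some window of length x has no hash shared between the groups.
bool check(const std::vector<std::vector<ull>>& ha,
           const std::vector<std::vector<ull>>& hb, int m, int x) {
    ull px = 1;
    for (int i = 0; i < x; ++i) px *= BASE;  // BASE^x, mod 2^64
    for (int l = 1; l + x - 1 <= m; ++l) {
        int r = l + x - 1;
        std::set<ull> seen;
        for (const auto& h : ha) seen.insert(h[r] - h[l - 1] * px);
        bool clash = false;
        for (const auto& h : hb)
            if (seen.count(h[r] - h[l - 1] * px)) { clash = true; break; }
        if (!clash) return true;
    }
    return false;
}

// Binary search for the smallest feasible window length.
int shortestWindow(const std::vector<std::string>& a,
                   const std::vector<std::string>& b) {
    int m = static_cast<int>(a[0].size());
    std::vector<std::vector<ull>> ha, hb;
    for (const auto& s : a) ha.push_back(prefixHash(s));
    for (const auto& s : b) hb.push_back(prefixHash(s));
    int lo = 1, hi = m, ans = m;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (check(ha, hb, m, mid)) { ans = mid; hi = mid - 1; }
        else lo = mid + 1;
    }
    return ans;
}
```

On the sample genomes, `check` fails for length 3 and succeeds for length 4, so `shortestWindow` returns 4.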

#include <bits/stdc++.h>
#define ll long long
#define inf 0x3f3f3f3f
#define il inline
#define ull unsigned long long

namespace io {

#define in(a) a = read()
#define out(a) write(a)
#define outn(a) out(a), putchar('\n')

#define I_int ll
inline I_int read() {
    I_int x = 0, f = 1;
    char c = getchar();
    while (c < '0' || c > '9') {
        if (c == '-') f = -1;
        c = getchar();
    }
    while (c >= '0' && c <= '9') {
        x = x * 10 + c - '0';
        c = getchar();
    }
    return x * f;
}
char F[200];
inline void write(I_int x) {
    if (x == 0) return (void) (putchar('0'));
    I_int tmp = x > 0 ? x : -x;
    if (x < 0) putchar('-');
    int cnt = 0;
    while (tmp > 0) {
        F[cnt++] = tmp % 10 + '0';
        tmp /= 10;
    }
    while (cnt > 0) putchar(F[--cnt]);
}
#undef I_int

}
using namespace io;

using namespace std;

#define N 510
#define base 13131

// n and m are read from stdin during static initialization, before main() runs.
int n = read(), m = read();
char s[N][N], t[N][N];
ull h1[N][N], h2[N][N], p[N];
set<ull>S;

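// Hash of the substring [l, r] (1-indexed) from prefix hashes h.
// Unsigned overflow makes all arithmetic implicitly mod 2^64.
ull get(ull *h, int l, int r) {
    return h[r] - h[l-1] * p[r-l+1];
}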

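// Returns true if some window of length x separates spotty from plain cows.
bool check(int x) {
    bool ans = 0;
    for(int l = 1; l + x - 1 <= m; ++l) {
        int r = l + x - 1, flag = 0;
        // Collect the window hashes of all spotty genomes.
        S.clear();
        for(int i = 1; i <= n; ++i) {
            S.insert(get(h1[i], l, r));
        }
        // flag = 1 if any plain genome matches a spotty one on this window.
        for(int i = 1; i <= n; ++i) {
            if(S.find(get(h2[i], l, r)) != S.end()) {
                flag = 1;
                break;
            }
        }
        // No match at all: this window explains spottiness.
        if(!flag) {
            ans = 1;
            break;
        }
    }
    return ans;
}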

int main() { 
    for(int i = 1; i <= n; ++i) scanf("%s",s[i]+1);
    for(int i = 1; i <= n; ++i) scanf("%s",t[i]+1);
    p[0] = 1;
    for(int i = 1; i <= m; ++i) p[i] = p[i - 1] * base;
    for(int i = 1; i <= n; ++i) {
        for(int j = 1; j <= m; ++j) h1[i][j] = h1[i][j-1]*base+(ull)s[i][j];
        for(int j = 1; j <= m; ++j) h2[i][j] = h2[i][j-1]*base+(ull)t[i][j]; 
    }
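    // Binary search the smallest feasible length; length m always works
    // because no spotty genome equals a plain genome.
    int l = 1, r = m, ans = m;
    while(l <= r) {
        int mid = (l + r) >> 1;
        if(check(mid)) ans = mid, r = mid - 1;
        else l = mid + 1;
    }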
    outn(ans);
    return 0;
}
