[LeetCode] 139. Word Break

Date: 2020-01-26

Tags: LeetCode, Word Break


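Problem

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine whether s can be segmented into a space-separated sequence of one or more dictionary words.

Notes:

The same dictionary word may be reused multiple times in the segmentation.
You may assume the dictionary does not contain duplicate words.

Example 1: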

Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: Return true because "leetcode" can be segmented as "leet code".

Example 2:

Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
     Note that you may reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

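Approach

Brute-force search

The first idea is to walk through the string character by character and check whether each candidate substring is in wordDict. Since wordDict is a List, each lookup takes O(N) time, where N is the number of words. Copying the words into a Set first brings each lookup down to O(1) on average.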
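As a minimal sketch of that conversion (the class and method names here are illustrative assumptions, not part of the original solution), the list can be copied into a HashSet once, up front:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DictLookupSketch {
    // Copy the word list into a HashSet once; later contains() calls are
    // hash lookups instead of linear scans over the list.
    static Set<String> toWordSet(List<String> wordDict) {
        return new HashSet<>(wordDict);
    }
}
```

Note that hashing a candidate substring still takes time proportional to its length, so "O(1)" here refers to the number of dictionary words, not the length of the substring.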

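How do we check whether a string can be composed from wordDict? A simple idea is to split the string into two parts: if the first part is a dictionary word, recurse on the rest. For example: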

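import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        // Copy the dictionary into a set for O(1) average lookups.
        return helper(s, new HashSet<>(wordDict), 0);
    }

    // Returns true if the suffix of s starting at index start can be segmented.
    public boolean helper(String s, Set<String> wordDict, int start) {
        // The whole string has been consumed.
        if (start == s.length()) {
            return true;
        }
        // Try every prefix s[start, end) as the next word, then recurse on the rest.
        for (int end = start + 1; end <= s.length(); end++) {
            if (wordDict.contains(s.substring(start, end))
                && helper(s, wordDict, end)) {
                return true;
            }
        }
        return false;
    }
}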

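Clearly this approach does not work in practice because its time complexity is too high; interested readers can try submitting it.

Time complexity: O(2^n) recursive calls in the worst case. Each of the n - 1 gaps between characters either is or is not a cut point, so up to 2^(n-1) segmentations may be explored.
Space complexity: O(n), the maximum recursion depth.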
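To make the blow-up concrete, the following sketch instruments the same recursion with a call counter. The input family (s made of n copies of 'a' followed by 'b', with every run of 'a' up to length n in the dictionary) and the names BruteForceCallCounter, calls and countCalls are assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class BruteForceCallCounter {
    // Number of helper invocations in the most recent run (illustrative instrumentation).
    static long calls;

    // Same unmemoized recursion as above, plus a counter.
    static boolean helper(String s, Set<String> words, int start) {
        calls++;
        if (start == s.length()) {
            return true;
        }
        for (int end = start + 1; end <= s.length(); end++) {
            if (words.contains(s.substring(start, end)) && helper(s, words, end)) {
                return true;
            }
        }
        return false;
    }

    // Builds s = "a" * n + "b" with dictionary {"a", "aa", ..., "a" * n}
    // and returns how many times helper was invoked.
    static long countCalls(int n) {
        StringBuilder sb = new StringBuilder();
        Set<String> words = new HashSet<>();
        for (int i = 0; i < n; i++) {
            sb.append('a');
            words.add(sb.toString());
        }
        String s = sb.append('b').toString();
        calls = 0;
        helper(s, words, 0);
        return calls;
    }
}
```

On this input family every prefix of 'a' characters matches but the trailing 'b' never does, so no branch succeeds early. countCalls(10) reports 1024 calls, and each additional 'a' doubles the count.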

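Memoized search

There is too much repeated computation. Where is the repetition? Take an example:

Input: s = "AAAleetcodeB", wordDict = ["leet", "code", "A", "AA", "AAA"]
In the for loop:
1st branch: s = 'A' + helper('AAleetcodeB'), which eventually fails;
2nd branch: s = 'AA' + helper('AleetcodeB'), which eventually fails;
3rd branch: s = 'AAA' + helper('leetcodeB'), which eventually fails;

Notice that every one of the branches above ends up recomputing helper('leetcodeB').

The natural way to save time is to record what has already been searched. There are two ways to do this:

Dynamic programming
Search with a memo array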

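Dynamic programming

Let's look at dynamic programming first. It is quite easy to follow; the most important part is the state transition equation. If it is unclear, simulating it once by hand is usually enough to understand it.

dp[i] indicates whether the first i characters of s, that is the substring s[0, i), can be composed from words in wordDict.
dp[i] is true if there exists some j in [0, i) such that dp[j] is true and wordDict contains s[j, i). The empty prefix gives dp[0] = true.

A simulation of the dynamic programming process: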

Input: s = "AAAleetcodeB", wordDict = ["leet", "code", "A", "AA", "AAA"]
dp[0] = true // initialization.
1st step: dp[1] = true, wordDict contains 'A';
2nd step: dp[2] = true,
        dp[1] = true, wordDict contains 'A';
3rd step: dp[3] = true,
        dp[1] = true, wordDict contains 'AA';
...
Last step: dp[12] = false,
        dp[1] = true, wordDict does not contain 'AAleetcodeB';
        dp[2] = true, wordDict does not contain 'AleetcodeB';
        dp[3] = true, wordDict does not contain 'leetcodeB';
        dp[7] = true, wordDict does not contain 'codeB';
        dp[11] = true, wordDict does not contain 'B'
        Therefore dp[12] = false.

The Java code is as follows:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        if (s == null) return false;
        Set<String> wordSet = new HashSet<>(wordDict);
        boolean[] dp = new boolean[s.length() + 1];
        // Base state: the empty prefix can always be segmented.
        dp[0] = true;
        // dp[i] is true if the first i characters of s can be segmented.
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 0; j < i; j++) {
                String current = s.substring(j, i);
                // Prefix of length j is segmentable and s[j, i) is a word.
                if (dp[j] && wordSet.contains(current)) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[s.length()];
    }
}

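Time complexity: O(n^2) iterations from the two nested for loops. If the O(n) cost of each substring call is counted as well, the bound becomes O(n^3).
Space complexity: O(n). The dp array has length n + 1.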
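The hand simulation can be checked with a small variant of the code above that returns the whole dp table instead of only its last entry. This is a sketch for illustration; the names DpTableSketch and fillTable are assumptions, not part of the original solution.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DpTableSketch {
    // Returns the full table; dp[i] is true when the first i characters
    // of s can be segmented into dictionary words.
    static boolean[] fillTable(String s, List<String> wordDict) {
        Set<String> wordSet = new HashSet<>(wordDict);
        boolean[] dp = new boolean[s.length() + 1];
        dp[0] = true; // the empty prefix
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 0; j < i && !dp[i]; j++) {
                // Stop scanning j as soon as one valid split point is found.
                dp[i] = dp[j] && wordSet.contains(s.substring(j, i));
            }
        }
        return dp;
    }
}
```

For s = "AAAleetcodeB" with the dictionary from the simulation, the true entries are dp[0], dp[1], dp[2], dp[3], dp[7] and dp[11], and dp[12] is false, which matches the walkthrough.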

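DFS with memoization

Dynamic programming and memoized search are both very common techniques. For this problem we can use an array memoToEndContain that records whether the substring from position i to the end of the string can be composed from wordDict. Back to our earlier example:

Input: s = "AAAleetcodeB", wordDict = ["leet", "code", "A", "AA", "AAA"]
First for loop: 'A' can be composed from wordDict
        'AA' can be composed from wordDict
        ...
        'AAAleetcodeB' cannot be composed from wordDict
        After this depth-first search we remember:
        the suffixes starting at the first 'A', the second 'A', the third 'A', ... cannot be composed from wordDict;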

Java code:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Solution {
    public boolean wordBreak(String s, List<String> wordDict) {
        if (s == null) return false;
        Set<String> wordSet = new HashSet<>(wordDict);
        // memoToEndContain[i]: whether s[i..] can be segmented; null means not computed yet.
        Boolean[] memoToEndContain = new Boolean[s.length()];
        return dfs(s, 0, wordSet, memoToEndContain);
    }

    public boolean dfs(String s,
                       int start,
                       Set<String> wordSet,
                       Boolean[] memoToEndContain) {
        // Reached the end of the string.
        if (start == s.length()) {
            return true;
        }
        // This start position has been searched before.
        if (memoToEndContain[start] != null) {
            return memoToEndContain[start];
        }

        for (int end = start + 1; end <= s.length(); end++) {
            if (wordSet.contains(s.substring(start, end))
                && dfs(s, end, wordSet, memoToEndContain)) {
                memoToEndContain[start] = true;
                return true;
            }
        }
        memoToEndContain[start] = false;
        return false;
    }
}

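Time complexity: O(n^2). There are at most n start positions to memoize and each tries at most n end positions, so the search tree has at most n^2 nodes (not counting the cost of substring).
Space complexity: O(n). The memo array has n entries and the recursion depth is at most n.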
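To see the effect of the memo array, the sketch below counts dfs invocations on an adversarial input: n copies of 'a' followed by 'b', with every run of 'a' up to length n in the dictionary. The input family and the names MemoCallCounter, calls and countCalls are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class MemoCallCounter {
    // Number of dfs invocations in the most recent run, memo hits included.
    static long calls;

    static boolean dfs(String s, int start, Set<String> words, Boolean[] memo) {
        calls++;
        if (start == s.length()) {
            return true;
        }
        if (memo[start] != null) {
            return memo[start]; // answered from the memo, no further recursion
        }
        boolean ok = false;
        for (int end = start + 1; end <= s.length() && !ok; end++) {
            ok = words.contains(s.substring(start, end)) && dfs(s, end, words, memo);
        }
        memo[start] = ok;
        return ok;
    }

    // Builds s = "a" * n + "b" with dictionary {"a", "aa", ..., "a" * n}
    // and returns how many times dfs was invoked.
    static long countCalls(int n) {
        StringBuilder sb = new StringBuilder();
        Set<String> words = new HashSet<>();
        for (int i = 0; i < n; i++) {
            sb.append('a');
            words.add(sb.toString());
        }
        String s = sb.append('b').toString();
        calls = 0;
        dfs(s, 0, words, new Boolean[s.length()]);
        return calls;
    }
}
```

On this input each start position is expanded only once, so countCalls(n) is 1 + n(n + 1)/2: countCalls(10) is 56, whereas the unmemoized recursion makes 2^10 = 1024 calls on the same string.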
