LeetCode 28: Implement strStr()

Date: 2019-12-20

愛寫bug (ID: icodebugs)
Author: 愛寫bug

Implement the strStr() function.

Given a haystack string and a needle string, return the index (0-based) of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Clarification:

What should we return when needle is an empty string? This is a great question to ask during an interview.

For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent with C's strstr() and Java's indexOf().

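Approach (Java):

Brute force:

Complexity: time O(n*m) in the worst case, where n is the length of haystack and m is the length of needle; space O(1)

Starting from its first index, string a is compared position by position against the first character of string b: a[i++]==b[0]. When that is true, an inner loop compares string a from its (i+j)-th character against the j-th character of string b: a[i+j]==b[j]

Code: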

class Solution {
    public int strStr(String haystack, String needle) {
        if(needle.equals(""))return 0;
        int haystackLen=haystack.length(),needleLen=needle.length();
        char firstChar=needle.charAt(0);

        for(int i=0;i<=haystackLen-needleLen;i++){
            if(haystack.charAt(i)==firstChar){
                int j=1;
                for(;j<needleLen;j++){
                    if(haystack.charAt(i+j)!=needle.charAt(j)) break;
                }
                if(j==needleLen) return i;
            }
        }
        return -1;
    }
}
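
To make the cost of the nested loops concrete, here is a minimal sketch that counts character comparisons instead of returning the match index. BruteForceCount and countComparisons are hypothetical names introduced only for this illustration, and String.repeat assumes Java 11 or later.

```java
// Hypothetical helper class, not part of the original solution.
class BruteForceCount {
    // Same nested-loop idea as the brute-force search, but returns the number
    // of character comparisons performed up to the first match (or the end).
    static long countComparisons(String haystack, String needle) {
        int n = haystack.length(), m = needle.length();
        long comparisons = 0;
        if (m == 0) return 0;
        for (int i = 0; i <= n - m; i++) {
            int j = 0;
            while (j < m) {
                comparisons++;
                if (haystack.charAt(i + j) != needle.charAt(j)) break;
                j++;
            }
            if (j == m) return comparisons; // stop at the first match
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // A hard input for brute force: a long run of 'a' ending in 'b'.
        String haystack = "a".repeat(9) + "b"; // "aaaaaaaaab"
        String needle = "aaab";
        System.out.println(countComparisons(haystack, needle)); // 28
        System.out.println(countComparisons("hello", "ll"));    // 4
    }
}
```

On the first input every one of the 7 alignments needs all 4 comparisons, which is the (n - m + 1) * m pattern that makes the worst case quadratic when m grows with n.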

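KMP algorithm:

Complexity: time O(n+m), space O(m)

The following set of images helps illustrate the idea (image source: http://www.javashuo.com/article/p-ncuqifhj-eo.html ):

Note: in the images, the string haystack is "BBC ABCDAB ABCDABCDABDE" and the pattern needle is "ABCDABD"

Step 1, start matching:

Step 2, advance to the first matching character:

Step 3, compare the two strings character by character until the character D fails to match the space character, which ends this round of matching:

Step 4, restart matching, but not from the character B that follows the position found in step 2, because the six characters before the space have already matched the first six characters of the pattern. That is, the two characters AB before the C in the pattern are the same as the two characters AB before the space, so matching can resume directly by comparing the space against the character C:

As the images show, the comparison does not go back over the six haystack characters ABCDAB that already matched, and the first two needle characters AB are not compared again. The optimization idea is clear.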
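
The jump in step 4 can be reproduced with a small sketch. It builds the plain "longest proper prefix that is also a suffix" table for the pattern, which is a simpler, unoptimized relative of the next array used in the article's code. PrefixTable, build and lps are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only.
class PrefixTable {
    // lps[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] build(String pattern) {
        int[] lps = new int[pattern.length()];
        int len = 0; // length of the prefix currently being extended
        for (int i = 1; i < pattern.length(); i++) {
            // Fall back to shorter prefixes until one can be extended.
            while (len > 0 && pattern.charAt(i) != pattern.charAt(len)) {
                len = lps[len - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(len)) len++;
            lps[i] = len;
        }
        return lps;
    }

    public static void main(String[] args) {
        int[] lps = build("ABCDABD");
        System.out.println(Arrays.toString(lps)); // [0, 0, 0, 0, 1, 2, 0]

        // In step 3, six characters (ABCDAB) matched before the mismatch.
        int matched = 6;
        int shift = matched - lps[matched - 1];
        System.out.println(shift); // 4
    }
}
```

The value 2 stored for "ABCDAB" is the shared AB, so the pattern moves right by 6 - 2 = 4 positions and comparison resumes at the pattern's C.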

Code:

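class Solution {
    public int strStr(String haystack, String needle) {
        if(needle.equals("")) return 0;
        int[] next = new int[needle.length()];
        getNext(next, needle);// build the next array
        // i is the pointer into the text haystack, j is the pointer into the pattern needle
        int i = 0, j = 0;
        while(i < haystack.length() && j < needle.length()){
            // j == -1 means the fallback went past the first pattern character: advance i and restart the pattern at index 0
            if(j == -1 || haystack.charAt(i) == needle.charAt(j)){
                // on a match, advance both pointers to compare the next characters
                i++;j++;
            } else {
                j = next[j];// on a mismatch, set j to next[j] to avoid repeating comparisons
            }
        }
        return j == needle.length() ? i - j : -1;
    }

    private void getNext(int[] next, String needle){
        // k is the end of the matching prefix, and also the length of the matching part
        // j is the end of the suffix, i.e. the end of the matching part of the suffix
        int k = -1, j = 0;
        next[0] = -1;
        while(j < needle.length() - 1){
            // if k == -1, matching failed: restart counting the common prefix/suffix length
            // if the two characters are equal, extend the previous common prefix/suffix length by 1 and continue with the next position
            if (k == -1 || needle.charAt(j) == needle.charAt(k)){
                k++;j++;
                if (needle.charAt(j) != needle.charAt(k))
                    next[j] = k;
                else
                    // p[j] == p[next[j]] must not occur, so in that case fall back one level further (k = next[k] = next[next[k]]) to cut redundant comparisons
                    next[j] = next[k];
            } else {
                // otherwise shorten the prefix length to next[k]
                k = next[k];
            }
        }
    }
}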
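
As a sanity check, the sketch below restates the same recurrence as static methods, prints the next array for the example pattern, and compares the search result with Java's built-in String.indexOf. KmpCheck is a hypothetical class name, and the extra test strings are assumptions added for illustration.

```java
import java.util.Arrays;

// Hypothetical harness; the logic mirrors the KMP solution in this article.
class KmpCheck {
    // Optimized next array; needle must be non-empty.
    static int[] getNext(String needle) {
        int[] next = new int[needle.length()];
        int k = -1, j = 0;
        next[0] = -1;
        while (j < needle.length() - 1) {
            if (k == -1 || needle.charAt(j) == needle.charAt(k)) {
                k++; j++;
                // If the characters are equal, falling back to k would fail
                // the same way, so reuse next[k] instead.
                next[j] = needle.charAt(j) != needle.charAt(k) ? k : next[k];
            } else {
                k = next[k];
            }
        }
        return next;
    }

    static int strStr(String haystack, String needle) {
        if (needle.isEmpty()) return 0;
        int[] next = getNext(needle);
        int i = 0, j = 0;
        while (i < haystack.length() && j < needle.length()) {
            if (j == -1 || haystack.charAt(i) == needle.charAt(j)) {
                i++; j++;
            } else {
                j = next[j];
            }
        }
        return j == needle.length() ? i - j : -1;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(getNext("ABCDABD")));
        // [-1, 0, 0, 0, -1, 0, 2]
        System.out.println(strStr("BBC ABCDAB ABCDABCDABDE", "ABCDABD")); // 15

        String[][] cases = {
            {"hello", "ll"}, {"aaaaa", "bba"}, {"abc", ""},
            {"mississippi", "issip"}, {"a", "ab"}
        };
        for (String[] c : cases) {
            // Both columns should agree for every case.
            System.out.println(strStr(c[0], c[1]) + " " + c[0].indexOf(c[1]));
        }
    }
}
```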

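Summary:

The direction of KMP's optimization is clear; the main difficulty lies in computing and understanding the next array. KMP is not the focus of this article. If you want to study it in depth, this post is recommended: http://www.javashuo.com/article/p-ncuqifhj-eo.html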

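There is also the Sunday algorithm, which compares the pattern against a window of the source string of the same length and, when the window does not match, decides how far to shift from the source character immediately after that window. Its central idea is:

If that character does not appear in the pattern, shift right by pattern length + 1, measured from the current window start. (No same-length window of the source string that contains this character can match.)

If that character does appear in the pattern, shift by the distance from its rightmost occurrence in the pattern to the end of the pattern, plus 1.

For example, when the haystack window "BBC ABC" is compared with the pattern needle "ABCDABD", the space character in haystack does not appear in needle, so the comparison of the six characters following the space can be skipped outright: no same-length string that contains the space can match.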
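
The two shift rules can be sketched as follows, assuming the character examined is the one immediately after the current window, which is how the Sunday algorithm is usually stated. SundaySearch, indexOf and the shift map are hypothetical names for this illustration, not code from the original post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the Sunday shift rules.
class SundaySearch {
    static int indexOf(String haystack, String needle) {
        int n = haystack.length(), m = needle.length();
        if (m == 0) return 0;

        // shift.get(c) = distance from the rightmost c in needle to the end, plus 1.
        // Later occurrences overwrite earlier ones, so the rightmost one wins.
        Map<Character, Integer> shift = new HashMap<>();
        for (int i = 0; i < m; i++) {
            shift.put(needle.charAt(i), m - i);
        }

        int start = 0;
        while (start <= n - m) {
            if (haystack.regionMatches(start, needle, 0, m)) return start;
            int after = start + m;   // character just past the window
            if (after >= n) break;   // no room left for another window
            // Characters absent from needle shift by m + 1.
            start += shift.getOrDefault(haystack.charAt(after), m + 1);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("BBC ABCDAB ABCDABCDABDE", "ABCDABD")); // 15
        System.out.println(indexOf("hello", "ll"));                        // 2
        System.out.println(indexOf("aaaaa", "bba"));                       // -1
    }
}
```

A HashMap keeps the sketch independent of the character set; a fixed-size int array indexed by character code would be a common alternative for ASCII input.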

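Python3:

Note: the two approaches above work in every language and differ only in syntax, so they are not repeated in py3. Only a few py3-specific shortcuts are shown here.

Use the py3 built-in method find() to get the result directly.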

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        return haystack.find(needle)

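Description of find()

find() checks whether a string contains the substring sub. If start and end are given, only that range is searched. If the substring is found, the index at which it first starts in the string is returned; if it is not found, -1 is returned. If the substring is empty, 0 is returned (with the default start).

Syntax
str.find(sub[, start[, end]])
Parameters

sub -- the substring to search for

start -- start index, defaults to 0.

end -- end index, defaults to the length of the string.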

Solve it with py3 string slicing:

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        for i in range(len(haystack)-len(needle)+1):
            if haystack[i:i+len(needle)]==needle:  # compare a slice of needle's length
                return i
        return -1

Note: Chapter 32 of Introduction to Algorithms, "String Matching", is a full chapter devoted to this topic.