Regular Expression Matching
Implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
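The examples above can be cross-checked against Java's built-in regex engine. This is only a sketch: `ExampleCheck` and `reference` are hypothetical names, not part of the original post, and it assumes that for patterns made only of letters, '.' and '*' the semantics of `java.util.regex` agree with the problem statement (`Pattern.matches` requires the whole input to match).

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original post.
class ExampleCheck {
    // Full-string match using java.util.regex as a reference implementation.
    static boolean reference(String s, String p) {
        return Pattern.matches(p, s);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"aa", "a"}, {"aa", "aa"}, {"aaa", "aa"}, {"aa", "a*"},
            {"aa", ".*"}, {"ab", ".*"}, {"aab", "c*a*b"}
        };
        for (String[] c : cases) {
            System.out.println("isMatch(\"" + c[0] + "\", \"" + c[1] + "\") -> "
                    + reference(c[0], c[1]));
        }
    }
}
```

Any of the solutions in this post can be compared against `reference` in the same loop.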
SOLUTION1:
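The overall idea is recursion.
Check whether the next pattern character is '*':
If it is not '*', check whether the current characters match.
If it is '*', we cannot know in advance how many characters the '*' will consume. So, provided the current character matches, we enumerate every case: 0 characters, 1, 2, and so on. If any one case succeeds, the whole match succeeds.
We start from 0: first try skipping the two pattern characters directly, then continue the search with 1, 2, ... characters.
If it is '*' but the current character does not match, skip the two pattern characters and recurse.
The code is below; the comments explain the details.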
ref: http://blog.csdn.net/fightforyourdream/article/details/17717873
package Algorithms;

public class IsMach {
    public static void main(String[] str) {
        //System.out.println(isMatch("aa", "aa"));
        System.out.println(isMatch("aab", "c*a*b"));
    }

    public static boolean isMatch(String s, String p) {
        if (s == null || p == null) {
            return false;
        }

        return help(s, p, 0, 0);
    }

    public static boolean help(String s, String p, int indexS, int indexP) {
        int pLen = p.length();
        int sLen = s.length();

        // 1. p is exhausted, so s must be exhausted too.
        if (indexP == pLen) {
            return indexS == sLen;
        }

        // 2. Only the last character of p is left unmatched.
        if (indexP == pLen - 1) {
            // s must have exactly one character left,
            // and it must equal p's character unless that character is '.'.
            return indexS == sLen - 1 && matchChar(s, p, indexS, indexP);
        }

        // From here on, p has at least 2 characters left.

        // 3. Single-character match (the next pattern character is not '*').
        if (p.charAt(indexP + 1) != '*') {
            if (indexS < sLen && matchChar(s, p, indexS, indexP)) {
                return help(s, p, indexS + 1, indexP + 1); // p advances by one
            } else {
                return false;
            }
        }

        // 4. Repeated match, e.g. .* or a*. This needs recursion.

        // First skip these 2 pattern characters, because they may match the empty string.
        if (help(s, p, indexS, indexP + 2)) {
            return true;
        }

        // Non-empty case: p cannot be skipped here; it must match 1 or more characters.
        for (int i = indexS; i < sLen; i++) {
            if (!matchChar(s, p, i, indexP)) {
                return false;
            } else {
                if (help(s, p, i + 1, indexP + 2)) {
                    return true;
                }
            }
        }

        // Even after the repeated match, the rest of the string cannot be matched: fail.
        return false;
    }

    // Check whether s at indexS matches p at indexP.
    public static boolean matchChar(String s, String p, int indexS, int indexP) {
        return (s.charAt(indexS) == p.charAt(indexP)) || p.charAt(indexP) == '.';
    }
}
SOLUTION2:
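Rewritten slightly. The idea is essentially unchanged, only a little simpler:
We only need to distinguish 2 cases:
1. The next pattern character is '*'. Here the length of s does not matter, because s may be empty.
2. The next pattern character is not '*'. All such cases are handled uniformly: s must have at least one character left, and if that character matches, we simply continue recursing.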
public class Solution {
    public boolean isMatch(String s, String p) {
        if (s == null || p == null) {
            return false;
        }

        return isMatchRec(s, p, 0, 0);
    }

    public boolean isMatchRec(String s, String p, int indexS, int indexP) {
        int lenS = s.length();
        int lenP = p.length();

        // We have reached the end of the pattern.
        if (indexP == lenP) {
            return indexS == lenS;
        }

        // At least 2 pattern characters are left and the next one is '*'.
        if (indexP < lenP - 1 && p.charAt(indexP + 1) == '*') {
            // Match 0 characters.
            if (isMatchRec(s, p, indexS, indexP + 2)) {
                return true;
            }

            // Match 1 or more characters.
            for (int i = indexS; i < lenS; i++) {
                if (!isMatchChar(s.charAt(i), p.charAt(indexP))) {
                    return false;
                }

                if (isMatchRec(s, p, i + 1, indexP + 2)) {
                    return true;
                }
            }

            // None of the choices led to a match.
            return false;
        }

        // Match the current character, then the rest of the string.
        return indexS < lenS
            && isMatchChar(s.charAt(indexS), p.charAt(indexP))
            && isMatchRec(s, p, indexS + 1, indexP + 1);
    }

    public boolean isMatchChar(char s, char p) {
        if (p == '*') {
            return false;
        }

        if (s == p || p == '.') {
            return true;
        }

        return false;
    }

}
2014.12.28 Redo:
public boolean isMatch(String s, String p) {
    if (s == null || p == null) {
        return false;
    }

    return dfs(s, p, 0, 0);
}

public boolean dfs(String s, String p, int indexS, int indexP) {
    int lenS = s.length();
    int lenP = p.length();

    // THE BASE CASE:
    if (indexP >= lenP) {
        // indexP is out of range, so s must also be exhausted.
        return indexS >= lenS;
    }

    // The first case: the next pattern character is '*'.
    if (indexP != lenP - 1 && p.charAt(indexP + 1) == '*') {
        // p skips 2 characters, and s can skip 0 or more characters.
        if (dfs(s, p, indexS, indexP + 2)) {
            return true;
        }

        for (int i = indexS; i < lenS; i++) {
            // The characters are not equal.
            // bug 2 (from an earlier submission): java.lang.StringIndexOutOfBoundsException: String index out of range: -1
            if (!isSame(s.charAt(i), p.charAt(indexP))) {
                return false;
            }

            if (dfs(s, p, i + 1, indexP + 2)) {
                return true;
            }
        }

        // None of them can match.
        return false;
    } else {
        // s should have at least one character left.
        if (indexS >= lenS) {
            return false;
        }

        if (!isSame(s.charAt(indexS), p.charAt(indexP))) {
            return false;
        }

        // bug 1: forgot the ';'
        return dfs(s, p, indexS + 1, indexP + 1);
    }
}

public boolean isSame(char c, char p) {
    return p == '.' || c == p;
}
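Time complexity: O(2^N) in the worst case.
The reason: suppose p consists entirely of a*a*a*... and s = aaaaaaaa. Each character of s then has 2 possibilities: it is consumed by the current a*, or by the next a* (the current one stops matching there). So if
s has N characters, the search explores on the order of 2^N combinations.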
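The blow-up described above can be observed by counting recursive calls. The sketch below reuses the plain DFS from the redo with a call counter added. `CallCounter` and `countCalls` are hypothetical names, and appending a trailing 'b' to the pattern is an assumption made here so that no match exists and the search cannot stop early.

```java
// Hypothetical demo class, not part of the original post.
class CallCounter {
    static long calls;

    static boolean same(char c, char p) {
        return p == '.' || c == p;
    }

    // Same recursion as the plain DFS, plus a counter.
    static boolean dfs(String s, String p, int i, int j) {
        calls++;
        if (j >= p.length()) {
            return i >= s.length();
        }
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            if (dfs(s, p, i, j + 2)) {
                return true;
            }
            for (int k = i; k < s.length(); k++) {
                if (!same(s.charAt(k), p.charAt(j))) {
                    return false;
                }
                if (dfs(s, p, k + 1, j + 2)) {
                    return true;
                }
            }
            return false;
        }
        return i < s.length() && same(s.charAt(i), p.charAt(j)) && dfs(s, p, i + 1, j + 1);
    }

    // s = n copies of "a"; p = n copies of "a*" followed by "b" (never matches).
    static long countCalls(int n) {
        StringBuilder s = new StringBuilder();
        StringBuilder p = new StringBuilder();
        for (int i = 0; i < n; i++) {
            s.append('a');
            p.append("a*");
        }
        p.append('b');
        calls = 0;
        dfs(s.toString(), p.toString(), 0, 0);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 12; n += 2) {
            System.out.println("n = " + n + ", calls = " + countCalls(n));
        }
    }
}
```

The printed counts grow exponentially with n, which is what motivates adding memoization.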
SOLUTION3:
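Memoized search: add a memo matrix on top of SOLUTION 2. The complexity is O(M*N*M), where M is the length of s and N is the length of p.
The last factor M arises because each time we meet a '*' we have to scan the string once.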
// solution 3: dfs + memory
public boolean isMatch(String s, String p) {
    if (s == null || p == null) {
        return false;
    }

    // mem[i][j]: -1 = not computed yet, 0 = no match, 1 = match.
    int[][] mem = new int[s.length() + 1][p.length() + 1];

    // BUG 1: forgot to initialize the memo array.
    // BUG 2: the loop bounds must be <=.
    for (int i = 0; i <= s.length(); i++) {
        for (int j = 0; j <= p.length(); j++) {
            mem[i][j] = -1;
        }
    }

    return dfsMem(s, p, 0, 0, mem);
}

// Uses the isSame(char, char) helper from the 2014.12.28 redo.
public boolean dfsMem(String s, String p, int indexS, int indexP, int[][] mem) {
    int lenS = s.length();
    int lenP = p.length();

    if (mem[indexS][indexP] != -1) {
        return mem[indexS][indexP] == 1;
    }

    // THE BASE CASE:
    if (indexP >= lenP) {
        // indexP is out of range, so s must also be exhausted.
        mem[indexS][indexP] = indexS >= lenS ? 1 : 0;
        return indexS >= lenS;
    }

    // The first case: the next pattern character is '*'.
    if (indexP != lenP - 1 && p.charAt(indexP + 1) == '*') {
        // p skips 2 characters, and s can skip 0 or more characters.
        if (dfsMem(s, p, indexS, indexP + 2, mem)) {
            mem[indexS][indexP] = 1;
            return true;
        }

        for (int i = indexS; i < lenS; i++) {
            // The characters are not equal.
            if (!isSame(s.charAt(i), p.charAt(indexP))) {
                mem[indexS][indexP] = 0;
                return false;
            }

            if (dfsMem(s, p, i + 1, indexP + 2, mem)) {
                mem[indexS][indexP] = 1;
                return true;
            }
        }

        // None of them can match.
        mem[indexS][indexP] = 0;
        return false;
    } else {
        // s should have at least one character left.
        boolean ret = indexS < lenS
            && isSame(s.charAt(indexS), p.charAt(indexP))
            && dfsMem(s, p, indexS + 1, indexP + 1, mem);

        mem[indexS][indexP] = ret ? 1 : 0;
        return ret;
    }
}
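The memoized search above notes two pitfalls: forgetting to initialize the memo array and getting the loop bounds wrong. One way to sidestep both, sketched here as an alternative rather than the author's method, is a `Boolean[][]` memo in which `null` means "not computed yet". This sketch also swaps the scan over s for a single recursive step on (i + 1, j), so each state does constant work. `MemoMatch` is a hypothetical name.

```java
// Hypothetical alternative, not the original post's code.
class MemoMatch {
    static boolean isMatch(String s, String p) {
        if (s == null || p == null) {
            return false;
        }
        // Object arrays start out as null, so no explicit initialization is needed.
        Boolean[][] memo = new Boolean[s.length() + 1][p.length() + 1];
        return dfs(s, p, 0, 0, memo);
    }

    static boolean dfs(String s, String p, int i, int j, Boolean[][] memo) {
        if (memo[i][j] != null) {
            return memo[i][j];
        }
        boolean ans;
        if (j == p.length()) {
            // Pattern exhausted: the string must be exhausted too.
            ans = i == s.length();
        } else {
            // Does the current character of s match the current pattern element?
            boolean first = i < s.length()
                    && (p.charAt(j) == '.' || p.charAt(j) == s.charAt(i));
            if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
                // Either drop "x*" entirely, or consume one character and stay on "x*".
                ans = dfs(s, p, i, j + 2, memo) || (first && dfs(s, p, i + 1, j, memo));
            } else {
                ans = first && dfs(s, p, i + 1, j + 1, memo);
            }
        }
        memo[i][j] = ans;
        return ans;
    }
}
```

With at most (M + 1) * (N + 1) states and constant work per state, this variant runs in O(M*N) time, at the cost of boxing the cached values.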
SOLUTION 4:
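DP:
D[i][j]: whether the first i characters of string s match the first j characters of string p.
State transitions:
1. j >= 2 && p[j - 1] == '*': the starred element can match the empty string or any number of characters of s.
k: the number of characters of s consumed by the starred element.
D[i][j] is true if, for some k in 0..i, D[i - k][j - 2] is true and each of the last k characters matches p[j - 2]:
k = 0: D[i][j - 2]
k = 1..i: D[i - k][j - 2] && isSame(s.charAt(i - t), p.charAt(j - 2)) for every t in 1..k
2. The last character of p is not '*':
First, s must have at least one character left; then that character must match, and the prefixes before it must match as well.
D[i][j] = i >= 1
&& isSame(s.charAt(i - 1), p.charAt(j - 1))
&& D[i - 1][j - 1];
3. j = 0;
D[i][j] = i == 0; (if p is empty, s must also be empty to match)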
On LeetCode the running times show no significant difference between the approaches (the test data for this problem is too weak... orz).
// solution 4: DP
// Uses the isSame(char, char) helper from the 2014.12.28 redo.
public boolean isMatch(String s, String p) {
    if (s == null || p == null) {
        return false;
    }

    // bug 2: should use boolean instead of int.
    boolean[][] D = new boolean[s.length() + 1][p.length() + 1];

    // D[i][j]: i and j are the prefix lengths of s and p.
    for (int i = 0; i <= s.length(); i++) {
        for (int j = 0; j <= p.length(); j++) {
            if (j == 0) {
                // When p is empty, s must be empty too.
                D[i][j] = i == 0;
            } else if (p.charAt(j - 1) == '*') {
                // The last character of this prefix of p is '*'.
                if (j < 2) {
                    // Malformed pattern: there must be a character before '*'.
                    // Treat it as no match instead of indexing p[-1] and D[..][-1].
                    D[i][j] = false;
                    continue;
                }

                // We can match 0 characters or more characters.
                for (int k = 0; k <= i; k++) {
                    // BUG 3: severe! Forgot to deal with the empty string (k == 0).
                    if (k != 0 && !isSame(s.charAt(i - k), p.charAt(j - 2))) {
                        D[i][j] = false;
                        break;
                    }

                    if (D[i - k][j - 2]) {
                        D[i][j] = true;
                        break;
                    }
                }
            } else {
                D[i][j] = i >= 1
                    && isSame(s.charAt(i - 1), p.charAt(j - 1))
                    && D[i - 1][j - 1];
            }
        }
    }

    return D[s.length()][p.length()];
}
SOLUTION 5(DP):
Date: Sep 14, 2017
Simplified DP logic:
D[i][j]:
1. if j == 0 => D[i][j] = (i == 0)
2. if (j >= 2 && p.charAt(j-1) == '*') => D[i][j-2] || (i >= 1 && isEqual(s[i-1], p[j-2]) && D[i-1][j])
3. otherwise => i >= 1 && isEqual(s[i-1], p[j-1]) && D[i-1][j-1]
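Why case 2 can use D[i-1][j] instead of Solution 4's enumeration over k can be checked with a short derivation. This is a sketch for i >= 1 and j >= 2 with p[j-1] = '*'. The symbol M_t is introduced here only as shorthand for isEqual(s[t-1], p[j-2]), i.e. the t-th character of s matches the starred element.

```latex
% M_t abbreviates isEqual(s[t-1], p[j-2]).
\begin{align*}
D[i][j]
  &= \bigvee_{k=0}^{i} \Bigl( D[i-k][j-2] \land \bigwedge_{t=i-k+1}^{i} M_t \Bigr) \\
  &= D[i][j-2] \;\lor\; \Bigl( M_i \land \bigvee_{k=1}^{i}
       \Bigl( D[i-k][j-2] \land \bigwedge_{t=i-k+1}^{i-1} M_t \Bigr) \Bigr) \\
  &= D[i][j-2] \;\lor\; \Bigl( M_i \land \bigvee_{k'=0}^{i-1}
       \Bigl( D[(i-1)-k'][j-2] \land \bigwedge_{t=(i-1)-k'+1}^{i-1} M_t \Bigr) \Bigr) \\
  &= D[i][j-2] \;\lor\; \bigl( M_i \land D[i-1][j] \bigr)
\end{align*}
```

The second line splits off the k = 0 term and factors M_i out of every remaining term. The third substitutes k' = k - 1, and the inner disjunction is then exactly the first line's definition applied to D[i-1][j]. For i = 0 only the k = 0 term exists, so D[0][j] = D[0][j-2].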
class Solution {
    public boolean isMatch(String s, String p) {
        int lenS = s.length();
        int lenP = p.length();

        boolean[][] D = new boolean[lenS+1][lenP+1];

        for (int i = 0; i <= lenS; i++) {
            for (int j = 0; j <= lenP; j++) {
                if (j == 0) {
                    D[i][j] = i == 0;
                } else if (j >= 2 && p.charAt(j-1) == '*') {
                    D[i][j] = D[i][j - 2] ||
                        (i >= 1 && isEqual(s.charAt(i-1), p.charAt(j-2)) && D[i-1][j]);
                } else {
                    D[i][j] = i > 0 && isEqual(s.charAt(i-1), p.charAt(j-1)) && D[i-1][j-1];
                }
            }
        }

        return D[lenS][lenP];
    }

    public boolean isEqual(char s, char p) {
        return p == '.' || s == p;
    }
}
GitHub:
https://github.com/yuzhangcmu/LeetCode_algorithm/blob/master/string/isMatch.java
https://github.com/yuzhangcmu/LeetCode_algorithm/blob/master/string/isMatch_2014_1228.java