JavaScript Regular Expressions (RegExp)

Date: 2019-11-20


Regex character classes:
		Table 10-2. Regular expression character classes
Character	Matches
[...]		Any one character between the brackets.
[^...]		Any one character not between the brackets.
.		Any character except newline or another Unicode line terminator.
\w		Any ASCII word character. Equivalent to [a-zA-Z0-9_].
\W		Any character that is not an ASCII word character. Equivalent to [^a-zA-Z0-9_].
\s		Any Unicode whitespace character.
\S		Any character that is not Unicode whitespace. Note that \w and \S are not the same thing.
\d		Any ASCII digit. Equivalent to [0-9].
\D		Any character other than an ASCII digit. Equivalent to [^0-9].
[\b]		A literal backspace (special case).
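
As a quick check of the table above, here is a minimal sketch that tries a few of the classes on sample strings. The strings and variable names are made up for illustration and are not part of the original notes.

```javascript
// \w is ASCII-only, so the accented letter \u00EF splits the word in two.
const wordChars = "na\u00EFve_42".match(/\w+/g);   // ["na", "ve_42"]

// \d matches ASCII digits only.
const digits = "tel: 555-0199".match(/\d+/g);      // ["555", "0199"]

// \s covers Unicode whitespace, including the no-break space \u00A0.
const hasNbsp = /\s/.test("\u00A0");               // true

// Inside a character class, \b means backspace (\u0008), not a word boundary.
const hasBackspace = /[\b]/.test("a\u0008b");      // true

// . does not match a line terminator.
const dotMatchesNewline = /./.test("\n");          // false

console.log(wordChars, digits, hasNbsp, hasBackspace, dotMatchesNewline);
```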
	Repetition (quantifiers):
Character	Meaning
{n,m}		Match the previous item at least n times but no more than m times.
{n,}		Match the previous item n or more times.
{n}		Match exactly n occurrences of the previous item.
?		Match zero or one occurrences of the previous item. That is, the previous item is optional. Equivalent to {0,1}.
+		Match one or more occurrences of the previous item. Equivalent to {1,}.
*		Match zero or more occurrences of the previous item. Equivalent to {0,}.
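
The quantifiers in the table can be sketched as follows. The sample strings are illustrative only; \b is used here so that a match cannot land inside a longer run of digits.

```javascript
const numbers = "1 22 333 4444 55555";

// {n,m}: between 2 and 4 digits.
const twoToFour = numbers.match(/\b\d{2,4}\b/g);    // ["22", "333", "4444"]

// {n,}: three or more digits.
const threePlus = numbers.match(/\b\d{3,}\b/g);     // ["333", "4444", "55555"]

// {n}: exactly three digits.
const exactlyThree = numbers.match(/\b\d{3}\b/g);   // ["333"]

// ?: the "u" is optional.
const optionalU = "color colour".match(/colou?r/g); // ["color", "colour"]

// *: zero occurrences is a valid match, so this matches the empty string at index 0.
const emptyMatch = "abc".match(/\d*/)[0];           // ""

console.log(twoToFour, threePlus, exactlyThree, optionalU, JSON.stringify(emptyMatch));
```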
Non-greedy mode:
		Append a question mark '?' to a quantifier to make it match as few times as possible.

var test='aaaca';
console.log(test.replace(/a+/,'b'));     //=>bca    greedy, non-global
console.log(test.replace(/a+?/,'b'));    //=>baaca  non-greedy, non-global
console.log(test.replace(/a+/g,'b'));    //=>bcb    greedy, global
console.log(test.replace(/a+?/g,'b'));   //=>bbbcb  non-greedy, global: with the g flag, the non-greedy match is applied repeatedly

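Note: the regex engine always looks for the earliest position at which a match is possible. If a match can start at the first character, a shorter match that starts at a later character is never considered, even in non-greedy mode.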

var test='aaaca';
console.log(test.replace(/a+c/,'b'));     //=>ba    greedy, non-global
console.log(test.replace(/a+?c/,'b'));    //=>ba    non-greedy, non-global
console.log(test.replace(/a+c/g,'b'));    //=>ba    greedy, global
console.log(test.replace(/a+?c/g,'b'));   //=>ba    non-greedy, global

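Alternation, grouping, and references
	Alternation: | tries the alternatives from left to right and stops at the first one that matches, a bit like if / else if. With the g flag, the whole pattern is applied again after each match, so it behaves more like a series of independent ifs.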

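var s='abcdef';
console.log(s.replace(/ab|cd|ef/,'n'));  // single match: only the first one is replaced  => ncdef
console.log(s.replace(/ab|cd|ef/g,'n')); // global match: all three are replaced          => nnn

	Grouping: ()
	1. Wrapping items in parentheses makes them a single unit that can be used with |, *, +, and ?.
	2. Each group also captures a subpattern. Groups are numbered in the order of their left parentheses.
		/([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/     // here \2 refers to ([Ss]cript)
		/([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/   // (?:) groups without capturing, so it is not numbered; here \2 refers to (fun\w*)
		/['"][^'"]*['"]/    // matches a quoted string, but the opening and closing quotes may differ (one single, one double)
		/(['"])[^'"]*\1/    // the closing quote must be the same as the opening quote
		/(['"])[^\1]*\1/    // wrong: a backreference cannot be used inside a character class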
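
To see the difference between these patterns, the following sketch runs them against a few test strings. The test strings and variable names are illustrative assumptions; the patterns are the ones listed above.

```javascript
const mixedQuotes = /['"][^'"]*['"]/;   // opening and closing quote may differ
const sameQuotes = /(['"])[^'"]*\1/;    // \1 forces the same quote on both ends

const mixedAcceptsMismatch = mixedQuotes.test("'hello\"");  // true
const sameRejectsMismatch = sameQuotes.test("'hello\"");    // false
const sameAcceptsMatch = sameQuotes.test("\"hello\"");      // true

// Group numbering follows the left parentheses.
const numbered = /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/.exec("JavaScript is funny");
// numbered[1] is "JavaScript", numbered[2] is "Script", numbered[3] is "funny"

// With (?:...), the inner group is not numbered, so group 2 is now (fun\w*).
const unnumbered = /([Jj]ava(?:[Ss]cript)?)\sis\s(fun\w*)/.exec("JavaScript is funny");
// unnumbered[1] is "JavaScript", unnumbered[2] is "funny"

console.log(mixedAcceptsMismatch, sameRejectsMismatch, sameAcceptsMatch);
console.log(numbered.slice(1), unnumbered.slice(1));
```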
Summary of alternation, grouping, and references:
Regular expression alternation, grouping, and reference characters
Character	Meaning
|		Alternation. Match either the subexpression to the left or the subexpression to the right.
(...)		Grouping. Group items into a single unit that can be used with *, +, ?, |, and so on. Also remember the characters
		that match this group for use with later references.
(?:...)		Grouping only. Group items into a single unit, but do not remember the characters that match this group (no capture).
\n		Match the same characters that were matched when group number n was first matched. Groups are subexpressions
		within (possibly nested) parentheses. Group numbers are assigned by counting left parentheses from left to right.
		Groups formed with (?: are not numbered.
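Special match positions (anchors):
	\b Matches a word boundary: the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string. [Note: inside a character class [], \b means backspace.]
		To match the word "java" on its own, use: /\bjava\b/
		Note: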

console.log("abc".match(/\babc\b/));    //[ 'abc', index: 0, input: 'abc' ]  there is a word boundary at both ends of the word
console.log("abc,".match(/\babc,\B/));  //[ 'abc,', index: 0, input: 'abc,' ] the end is not a word boundary, because the comma is not a word character

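	\B Matches a position that is not a word boundary. For example, to match [Ss]cript only when it is embedded in a longer word (word characters on both sides), and not as the standalone word "script": /\B[Ss]cript\B/
	^  Matches the start of the string (with the m flag, also the start of a line)
	$  Matches the end of the string (with the m flag, also the end of a line)
	(?=p)	Positive lookahead. The characters that follow must match p, but they are not included in the match.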

/[Jj]ava([Ss]cript)?(?=\:)/     // matches "JavaScript" in "JavaScript: The Definitive Guide", but not "Java" in "Java in a Nutshell"

	
	(?!p)	Negative lookahead. The characters that follow must not match p; they are not included in the match.

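/Java(?!Script)([A-Z]\w*)/     // matches "Java" followed by a capital letter and any word characters, as long as "Java" is not immediately followed by "Script". Matches: JavaBeans, JavaSrip. Does not match: Javanese, JavaScript, JavaScripter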

var text = '"hello world" javascript';
var quote = /"([^"]*)"/g;
console.log(text.match(quote));     // [ '"hello world"' ]  the match includes both double quotes
quote = /"([^"]*)(?=")/g;
console.log(text.match(quote));     // [ '"hello world' ]   the closing quote is required but not included in the match
quote = /"([^"]*)(?!")/g;
console.log(text.match(quote));     // [ '"hello worl', '" javascript' ]  each match starts at a double quote and must not be followed by one, so the first match backtracks one character and drops the "d"
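
The two lookahead patterns given earlier can be checked against the strings mentioned in their comments. This is a minimal sketch; the variable names are illustrative.

```javascript
// Positive lookahead: "JavaScript" only counts when a colon follows.
const beforeColon = /[Jj]ava([Ss]cript)?(?=\:)/;
const guide = beforeColon.exec("JavaScript: The Definitive Guide");
const nutshell = beforeColon.exec("Java in a Nutshell");
// guide[0] is "JavaScript" (the colon is not part of the match); nutshell is null

// Negative lookahead: "Java" must not be directly followed by "Script".
const notScript = /Java(?!Script)([A-Z]\w*)/;
const words = ["JavaBeans", "Javanese", "JavaScript", "JavaScripter"];
const matches = words.map(function (w) { return notScript.test(w); });
// matches is [true, false, false, false]

console.log(guide[0], nutshell, matches);
```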

Flags
		m	Multiline mode. With the m flag, ^ and $ also match at the start and end of each line in a string that contains line breaks.

var s='Java\nis fun';
console.log(s.replace(/java$/im,''));   // => matches "Java" at the end of the first line; the result is '\nis fun'
console.log(s.replace(/java$/i,''));    // => no match; the result is unchanged: 'Java\nis fun'

	i	Case-insensitive matching.
	g	Global matching. Without this flag, matching stops after the first match; with it, all matches are found.
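
A short sketch of how the three flags change the result. The sample strings are illustrative, not from the original notes.

```javascript
const lines = "one\ntwo\nthree";

// Without m, ^ matches only at the start of the whole string.
const withoutM = lines.match(/^\w+/g);            // ["one"]

// With m, ^ also matches at the start of each line.
const withM = lines.match(/^\w+/gm);              // ["one", "two", "three"]

// i ignores case.
const caseInsensitive = /java/i.test("JAVA");     // true

// Without g only the first match is replaced; with g all of them are.
const firstOnly = "a-b-c".replace(/-/, "+");      // "a+b-c"
const allMatches = "a-b-c".replace(/-/g, "+");    // "a+b+c"

console.log(withoutM, withM, caseInsensitive, firstOnly, allMatches);
```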
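String methods that use regular expressions
	1.string.search(regexp): 
				Returns the index of the first match, or -1 if there is no match.
			 	If the argument is not a regular expression, it is passed to RegExp to create one.
			 	Global search is not supported; the g flag is ignored.

"JavaScript".search(/script/i);    // => 4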

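	2.string.replace(searchRegExp,replaceStr):
				The first argument can be a string or a regular expression. A string is matched literally; it is not converted to a regular expression.
				If the first argument is a regular expression with the g flag, all matches are replaced rather than only the first.
				If the first argument contains groups, the second argument can refer to them with $n.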

var text = '"hello world" javascript';
var quote = /"([^"]*)"/g;
console.log(text.replace(quote, '「$1 lan」'));    //=>「hello world lan」 javascript
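
The rules for search() and replace() described above can be sketched side by side. The sample strings and variable names are assumptions chosen for illustration.

```javascript
// search(): the g flag is ignored, so repeated calls return the same index.
const globalRe = /a/g;
const firstSearch = "JavaScript".search(globalRe);    // 1
const secondSearch = "JavaScript".search(globalRe);   // 1 again

// A string argument to search() becomes a regex, so "." matches any character.
const dotAsRegex = "a.b".search(".");                 // 0

// replace(): a string pattern is matched literally and only once.
const literalDot = "a.b.c".replace(".", "-");         // "a-b.c"
const everyDot = "a.b.c".replace(/\./g, "-");         // "a-b-c"

// $1 and $2 in the replacement refer to the captured groups.
const swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");   // "John Doe"

console.log(firstSearch, secondSearch, dotAsRegex, literalDot, everyDot, swapped);
```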

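	3.string.match(regexp): 	
		Returns an array of match results, or null if there is no match. If the argument is a string, it is passed to the RegExp constructor to create a RegExp object.
		The g flag changes the behavior of this method:
		With the g flag, the returned array contains all matches.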

"1 plus 2 equals 3".match(/\d+/g) // returns ["1", "2", "3"]

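		Without the g flag, the first element is the matched string, and the following elements are the substrings matched by the regex's groups, in order. In replace() terms, result[1] corresponds to $1, and so on.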

var url = /(\w+):\/\/([\w.]+)\/(\S*)/;
var text = "Visit my blog at http://www.example.com/~david";
var result = text.match(url);
if (result != null) {
    var fullurl = result[0];  // Contains "http://www.example.com/~david" (the whole match)
    var protocol = result[1]; // Contains "http" (group 1)
    var host = result[2];     // Contains "www.example.com" (group 2)
    var path = result[3];     // Contains "~david" (group 3)
}

 Passing a non-global regex to match() is effectively the same as calling that regex's exec(): the returned array also has index and input properties.

var quote = /"([^"]*)(?!")/;
var text = '"hello world" javascript';
var message = text.match(quote);
console.log(message);   // [ '"hello worl', 'hello worl', index: 0, input: '"hello world" javascript' ] (newer engines also print groups: undefined)
                        // [match, group 1, match index, input string]; the lookahead (?!") is zero-width, so it adds nothing to the result

	
  4.string.split(regexp):

	Accepts a string separator: "123,456,789".split(","); // Returns ["123","456","789"]
	Also accepts a regular expression: "1, 2, 3, 4, 5".split(/\s*,\s*/); // Returns ["1","2","3","4","5"]; with split(',') the result is [ '1', ' 2', ' 3', ' 4', ' 5' ]

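RegExp Object
	The RegExp() constructor takes the pattern as a string, so every \ in the regex literal must be written as \\.
	A RegExp object has 5 properties:
		1.source: read-only; the text of the regular expression
		2.global: read-only; whether the g flag is set
		3.ignoreCase: read-only; whether the i flag is set
		4.multiline: read-only; whether the m flag is set
		5.lastIndex: read/write; with the g flag, the position at which the next search starts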
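
As a sketch of the escaping rule and the five properties, the following builds the same pattern from a literal and from a string. The pattern and the sample text are illustrative assumptions.

```javascript
// The same pattern written as a literal and as a constructor string.
const fromLiteral = /\d+\.\d+/g;
const fromString = new RegExp("\\d+\\.\\d+", "g");   // each \ is doubled inside the string

const sameSource = fromString.source === fromLiteral.source;   // true

// Read-only flag properties.
const flags = [fromString.global, fromString.ignoreCase, fromString.multiline];
// flags is [true, false, false]

// lastIndex starts at 0 and moves past the match after a successful global search.
const indexBefore = fromString.lastIndex;            // 0
const found = fromString.test("pi is 3.14");         // true
const indexAfter = fromString.lastIndex;             // 10

console.log(fromString.source, sameSource, flags, indexBefore, found, indexAfter);
```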
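	RegExp object methods
	regexp.exec(str):
		Behaves much like String.match(regexp); exec() is like match() broken into individual steps.
		The differences are:
1.	Whether or not the g flag is set, exec() returns only one match, with complete information about it, such as the match position index and the searched string input. [The first element is the matched string; the following elements are the substrings matched by the groups, in order.] If there is no match, it returns null.
2.	When a regex with the g flag calls exec(), its lastIndex is immediately set to the position just after the match. Calling exec() again on the same regex object starts searching from lastIndex. If no match is found, lastIndex is reset to 0. To search a new string with the same regex object, you can also set lastIndex = 0 manually.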

var pattern = /Java/g;
var text = "JavaScript is more fun than Java!";
for (var result; (result = pattern.exec(text)) != null;) {
    console.log("Matched '" + result[0] + "' at position " + result.index + ";next search begins at " + pattern.lastIndex)
}
// Matched 'Java' at position 0;next search begins at 4 
// Matched 'Java' at position 28;next search begins at 32

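  regexp.test(str):
		 A simplified exec(): it returns only true or false, depending on whether a match was found. It also updates lastIndex, so with the g flag the next call searches from lastIndex.
		The string methods search(), replace(), and match() do not use the lastIndex property to decide where to start.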

var regexp=/java/g;
var s='javascript is not java,it is more funny than java';
while(regexp.test(s)) {
    console.log('lastIndex: '+regexp.lastIndex);   // prints 4, then 22, then 49
}
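
Because exec() and test() share lastIndex on a global regex, reusing one regex object across calls can give surprising results. A minimal sketch of that behavior and of the manual reset, using made-up strings:

```javascript
const pattern = /java/g;

const first = pattern.test("java");           // true
const indexAfterFirst = pattern.lastIndex;    // 4

// The second call starts searching at index 4, finds nothing, and resets lastIndex.
const second = pattern.test("java");          // false
const indexAfterSecond = pattern.lastIndex;   // 0

// Resetting lastIndex by hand makes the regex safe to reuse on a new string.
pattern.test("java");                         // true, lastIndex is 4 again
pattern.lastIndex = 0;
const afterReset = pattern.test("java is fun");   // true

console.log(first, indexAfterFirst, second, indexAfterSecond, afterReset);
```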
