用正則表達式驗證郵件地址彷佛是一件簡單的事情,可是若是要完美的驗證一個合規的郵件地址,其實也許很複雜。郵件地址的規範來自於 RFC 5322 。 |
郵件地址的規範來自於 RFC 5322 。有一個網站 emailregex.com 專門列出各類編程語言下的驗證郵件地址的正則表達式,其中不少正則表達式都是我據說過而從未見過的複雜——我想說,作這個網站的程序員是被郵件驗證這件事傷害了多深啊!linux
其實,在產品環境中,通常來講並不須要這麼複雜的正則表達式來作到99.99%正確。通常來講,從執行效率和測試覆蓋率來講,只須要一個簡單的版本便可:程序員
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+/.[A-Z]{2,4}$/i
那麼下面咱們來看看這些更嚴謹、更復雜的正則表達式吧:正則表達式
驗證郵件地址的通用正則表達式(符合 RFC 5322 標準)編程
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:/.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[/x01-/x08/x0b/x0c/x0e-/x1f/x21/x23-/x5b/x5d-/x7f]|//[/x01-/x09/x0b/x0c/x0e-/x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|/[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[/x01-/x08/x0b/x0c/x0e-/x1f/x21-/x5a/x53-/x7f]|//[/x01-/x09/x0b/x0c/x0e-/x7f])+)/])
因爲各類語言對正則表達式的支持不一樣、語法差別和覆蓋率不一樣,因此,不一樣語言裏面的正則表達式也不一樣:dom
Python編程語言
這個是個簡單的版本:oop
r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+/.[a-zA-Z0-9-.]+$)"
Javascript測試
這個有點複雜了:網站
/^[-a-z0-9~!$%^&*_=+}{/'?]+(/.[-a-z0-9~!$%^&*_=+}{/'?]+)*@([a-z0-9_][-a-z0-9_]*(/.[-a-z0-9_]+)*/.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}/.[0-9]{1,3}/.[0-9]{1,3}/.[0-9]{1,3}))(:[0-9]{1,5})?$/i
Swiftatom
[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+//.[A-Za-z]{2,6}
PHP
PHP 的這個版本就更復雜了,覆蓋率就更大一些:
/^(?!(?:(?:/x22?/x5C[/x00-/x7E]/x22?)|(?:/x22?[^/x5C/x22]/x22?)){255,})(?!(?:(?:/x22?/x5C[/x00-/x7E]/x22?)|(?:/x22?[^/x5C/x22]/x22?)){65,}@)(?:(?:[/x21/x23-/x27/x2A/x2B/x2D/x2F-/x39/x3D/x3F/x5E-/x7E]+)|(?:/x22(?:[/x01-/x08/x0B/x0C/x0E-/x1F/x21/x23-/x5B/x5D-/x7F]|(?:/x5C[/x00-/x7F]))*/x22))(?:/.(?:(?:[/x21/x23-/x27/x2A/x2B/x2D/x2F-/x39/x3D/x3F/x5E-/x7E]+)|(?:/x22(?:[/x01-/x08/x0B/x0C/x0E-/x1F/x21/x23-/x5B/x5D-/x7F]|(?:/x5C[/x00-/x7F]))*/x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*/.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:/[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:/]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:/.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))/]))$/iD
Perl / Ruby
對與 PHP 的版本,Perl 和 Ruby 表示不服,能夠更嚴謹:
(?:(?:/r/n)?[ /t])*(?:(?:(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[/t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*|(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)*/<(?:(?:/r/n)?[ /t])*(?:@(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[/t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*(?:,@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[/t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*)*:(?:(?:/r/n)?[ /t])*)?(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*/>(?:(?:/r/n)?[ /t])*)|(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)*:(?:(?:/r/n)?[ /t])*(?:(?:(?:[^()<>@,;://"./[/]/000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*|(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)*/<(?:(?:/r/n)?[ /t])*(?:@(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*(?:,@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/]/000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*)*:(?:(?:/r/n)?[ /t])*)?(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*/>(?:(?:/r/n)?[ /t])*)(?:,/s*(?:(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*|(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)*/<(?:(?:/r/n)?[ /t])*(?:@(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*(?:,@(?:(?:/r/n)?[/t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*)*:(?:(?:/r/n)?[ /t])*)?(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|"(?:[^/"/r//]|//.|(?:(?:/r/n)?[ /t]))*"(?:(?:/r/n)?[ /t])*))*@(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*)(?:/.(?:(?:/r/n)?[ /t])*(?:[^()<>@,;://"./[/] /000-/031]+(?:(?:(?:/r/n)?[ /t])+|/Z|(?=[/["()<>@,;://"./[/]]))|/[([^/[/]/r//]|//.)*/](?:(?:/r/n)?[ /t])*))*/>(?:(?:/r/n)?[ /t])*))*)?;/s
Perl 5.10 及之後版本
上面的版本,嗯,我能夠說是天書嗎?反正我是沒有解讀的想法了。固然,新版本的 Perl 語言還有一個更易讀的版本(你是說真的麼?)
/(?(DEFINE) (?<address> (?&mailbox) | (?&group)) (?<mailbox> (?&name_addr) | (?&addr_spec)) (?<name_addr> (?&display_name)? (?&angle_addr)) (?<angle_addr> (?&CFWS)? < (?&addr_spec) > (?&CFWS)?) (?<group> (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ; (?&CFWS)?) (?<display_name> (?&phrase)) (?<mailbox_list> (?&mailbox) (?: , (?&mailbox))*) (?<addr_spec> (?&local_part) /@ (?&domain)) (?<local_part> (?&dot_atom) | (?"ed_string)) (?<domain> (?&dot_atom) | (?&domain_literal)) (?<domain_literal> (?&CFWS)? /[ (?: (?&FWS)? (?&dcontent))* (?&FWS)? /] (?&CFWS)?) (?<dcontent> (?&dtext) | (?"ed_pair)) (?<dtext> (?&NO_WS_CTL) | [/x21-/x5a/x5e-/x7e]) (?<atext> (?&ALPHA) | (?&DIGIT) | [!#/$%&'*+-/=?^_`{|}~]) (?<atom> (?&CFWS)? (?&atext)+ (?&CFWS)?) (?<dot_atom> (?&CFWS)? (?&dot_atom_text) (?&CFWS)?) (?<dot_atom_text> (?&atext)+ (?: /. (?&atext)+)*) (?<text> [/x01-/x09/x0b/x0c/x0e-/x7f]) (?<quoted_pair> // (?&text)) (?<qtext> (?&NO_WS_CTL) | [/x21/x23-/x5b/x5d-/x7e]) (?<qcontent> (?&qtext) | (?"ed_pair)) (?<quoted_string> (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))* (?&FWS)? (?&DQUOTE) (?&CFWS)?) (?<word> (?&atom) | (?"ed_string)) (?<phrase> (?&word)+) # Folding white space (?<FWS> (?: (?&WSP)* (?&CRLF))? (?&WSP)+) (?<ctext> (?&NO_WS_CTL) | [/x21-/x27/x2a-/x5b/x5d-/x7e]) (?<ccontent> (?&ctext) | (?"ed_pair) | (?&comment)) (?<comment> /( (?: (?&FWS)? (?&ccontent))* (?&FWS)? /) ) (?<CFWS> (?: (?&FWS)? (?&comment))* (?: (?:(?&FWS)? (?&comment)) | (?&FWS))) # No whitespace control (?<NO_WS_CTL> [/x01-/x08/x0b/x0c/x0e-/x1f/x7f]) (?<ALPHA> [A-Za-z]) (?<DIGIT> [0-9]) (?<CRLF> /x0d /x0a) (?<DQUOTE> ") (?<WSP> [/x20/x09]) ) (?&address)/x
Ruby (簡單版)
Ruby 表示,其實人家還有個簡單版本:
//A([/w+/-].?)+@[a-z/d/-]+(/.[a-z]+)*/.[a-z]+/z/i
.NET
這樣的版本誰沒有啊——.NET 說:
^/w+([-+.']/w+)*@/w+([-.]/w+)*/./w+([-.]/w+)*$
grep 命令
用 grep 命令在文件中查找郵件地址,我想你不會寫個若干行的正則表達式吧,意思一下就好了:
$ grep -E -o "/b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+/.[A-Za-z]{2,6}/b" filename.txt
SQL Server
在 SQL Server 中也是能夠用正則表達式的,不過這個代碼片斷應該是來自某個產品環境中的,因此,還體貼的照顧了那些把郵件地址寫錯的人:
select email from table_name where patindex ('%[ &'',":;!+=//()<>]%', email) > 0 -- Invalid characters or patindex ('[@.-_]%', email) > 0 -- Valid but cannot be starting character or patindex ('%[@.-_]', email) > 0 -- Valid but cannot be ending character or email not like '%@%.%' -- Must contain at least one @ and one . or email like '%..%' -- Cannot have two periods in a row or email like '%@%@%' -- Cannot have two @ anywhere or email like '%.@%' or email like '%@.%' -- Cannot have @ and . next to each other or email like '%.cm' or email like '%.co' -- Camaroon or Colombia? Typos. or email like '%.or' or email like '%.ne' -- Missing last letter
Oracle PL/SQL
這個是否是有點偷懶?尤爲是在那些「複雜」的正則表達式以後:
SELECT email FROM table_name WHERE REGEXP_LIKE (email, '[A-Z0-9._%-]+@[A-Z0-9._%-]+/.[A-Z]{2,4}');
MySQL
好吧,看來最後也同樣懶:
SELECT * FROM `users` WHERE `email` NOT REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+/.[A-Z]{2,4}$';
那麼,你有沒有關於驗證郵件地址的正則表達式分享給你們?