Regex Pattern Laboratory: PCRE2 Syntax Explorer & Debugger

Regular expressions, also called regexes or patterns, are text patterns used to search, replace, and manipulate text. They consist of ordinary literal characters (such as the letters a to z) together with special metacharacters. Popularized by Unix tools, they are now supported in languages such as Scala, PHP, C#, Java, C++, Objective-C, Perl, Swift, VBScript, JavaScript, Ruby, and Python. Learning regular expressions means learning a flexible, compact way to describe and manipulate strings.

Metacharacter	Description
\	Marks the next character as a special character, a literal character, a backreference, or an octal escape sequence. For example, “`n`” matches the character “`n`”, while “`\n`” matches a newline character. The sequence “`\\`” matches “`\`”, and “`\(`” matches “`(`”.
^	Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches positions following “`\n`” or “`\r`”.
$	Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before “`\n`” or “`\r`”.
*	Matches the preceding subexpression zero or more times. For example, “`zo*`” matches “`z`” and “`zoo`”. `*` is equivalent to `{0,}`.
+	Matches the preceding subexpression one or more times. For example, “`zo+`” matches “`zo`” and “`zoo`”, but not “`z`”. `+` is equivalent to `{1,}`.
?	Matches the preceding subexpression zero or one time. For example, “`do(es)?`” matches “`do`” or “`does`”. `?` is equivalent to `{0,1}`.
{n}	n is a non-negative integer. Matches exactly n times. For example, “`o{2}`” does not match the “`o`” in “`Bob`”, but it matches the two o's in “`food`”.
{n,}	n is a non-negative integer. Matches at least n times. For example, “o{2,}” does not match the “o” in “Bob”, but it matches all the o's in “foooood”. “o{1,}” is equivalent to “o+”, and “o{0,}” is equivalent to “o*”.
{n,m}	m and n are both non-negative integers, where n <= m. Matches at least n times and at most m times. For example, “o{1,3}” matches the first three o's in “fooooood”. “o{0,1}” is equivalent to “o?”. Note that there must be no space between the comma and the two numbers.
?	When this character immediately follows any other quantifier (`*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`), matching becomes non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much as possible. For example, in the string “oooo”, “o+?” matches a single “o”, while “o+” matches all four.
.	Matches any single character except “\n”. To match any character including “\n”, use a pattern like “(.|\n)”.
(pattern)	Matches the pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: in VBScript, use the SubMatches collection; in JScript, use the $0…$9 properties. To match literal parenthesis characters, use “\(” or “\)”.
(?=pattern)	Positive lookahead. Matches the search string at any position where a string matching the pattern begins. This is a non-capturing match, meaning the match is not captured for later use. For example, “Windows(?=95|98|NT|2000)” matches “Windows” in “Windows2000” but not “Windows” in “Windows3.1”. A lookahead does not consume characters: after a match, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)	Negative lookahead. Matches the search string at any position where a string not matching the pattern begins. This is a non-capturing match, meaning the match is not captured for later use. For example, “Windows(?!95|98|NT|2000)” matches “Windows” in “Windows3.1” but not “Windows” in “Windows2000”. A lookahead does not consume characters: after a match, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?<=pattern)	Positive lookbehind. Similar to positive lookahead, but operates in the opposite direction. For example, “(?<=95|98|NT|2000)Windows” matches “Windows” in “2000Windows” but not “Windows” in “3.1Windows”.
(?<!pattern)	Negative lookbehind. Similar to negative lookahead, but operates in the opposite direction. For example, “(?<!95|98|NT|2000)Windows” matches “Windows” in “3.1Windows” but not “Windows” in “2000Windows”.
x|y	Matches either x or y. For example, “z|food” matches either “z” or “food”. “(z|f)ood” matches either “zood” or “food”.
[xyz]	Character set. Matches any one of the characters it contains. For example, “[abc]” matches the “a” in “plain”.
[^xyz]	Negative character set. Matches any character not in the set. For example, “[^abc]” matches the “p” in “plain”.
[a-z]	Character range. Matches any character within the specified range. For example, “[a-z]” matches any lowercase letter from “a” to “z”.
[^a-z]	Negative character range. Matches any character not within the specified range. For example, “[^a-z]” matches any character outside the range from “a” to “z”.
\b	Matches a word boundary, that is, the position between a word character and a non-word character (such as a space), or between a word character and the start or end of the string. For example, “er\b” matches the “er” in “never” but not the “er” in “verb”.
\B	Matches a non-word boundary. “er\B” matches the “er” in “verb” but not the “er” in “never”.
\cx	Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal “c” character.
\d	Matches a numeric character. Equivalent to [0-9].
\D	Matches a non-numeric character. Equivalent to [^0-9].
\f	Matches a form feed character. Equivalent to \x0c and \cL.
\n	Matches a newline (line feed) character. Equivalent to \x0a and \cJ.
\r	Matches a carriage return character. Equivalent to \x0d and \cM.
\s	Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t	Matches a tab character. Equivalent to \x09 and \cI.
\v	Matches a vertical tab. Equivalent to \x0b and \cK.
\w	Matches any word character including underscores. Equivalent to “[A-Za-z0-9_]”.
\W	Matches any non-word character. Equivalent to “[^A-Za-z0-9_]”.
\xn	Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, “\x41” matches “A”. “\x041” is equivalent to “\x04” followed by “1”. This allows ASCII codes to be used in regular expressions.
\num	Matches num, where num is a positive integer: a backreference to the captured match with that number. For example, “(.)\1” matches two consecutive identical characters.
\n	Indicates an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
\nm	Indicates an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither condition holds, and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml	If n is an octal digit (0-3) and both m and l are octal digits (0-7), matches the octal escape value nml.
\un	Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
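
Several rows of the table above can be checked directly in code. The following is a minimal sketch that assumes Python's built-in `re` module (the table itself describes the JScript/VBScript RegExp object, so this is an illustration of shared syntax, not the engine documented above). It covers greedy and non-greedy quantifiers, lookahead, lookbehind, backreferences, and word boundaries. One difference is labeled in the comments: Python's `re` only accepts fixed-width lookbehind, so the lookbehind example uses a single alternative.

```python
import re

# Greedy vs. non-greedy quantifiers on the string "oooo".
greedy = re.match(r"o+", "oooo").group()    # consumes as many o's as possible
lazy = re.match(r"o+?", "oooo").group()     # consumes as few o's as possible

# Positive and negative lookahead: the version suffix is checked, not consumed.
pos_ahead = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
neg_ahead = re.search(r"Windows(?!95|98|NT|2000)", "Windows2000")

# Positive lookbehind. Python's re requires fixed-width lookbehind, so this
# sketch uses a single four-character alternative instead of 95|98|NT|2000.
behind = re.search(r"(?<=2000)Windows", "2000Windows")

# Backreference: (.)\1 finds two consecutive identical characters.
doubled = re.search(r"(.)\1", "food").group()

# Word boundary: er\b matches at the end of "never" but not inside "verb".
at_end = re.search(r"er\b", "never") is not None
inside = re.search(r"er\b", "verb") is not None

print(greedy, lazy, pos_ahead.group(), neg_ahead, behind.span(), doubled, at_end, inside)
```

Because the lookahead does not consume characters, `pos_ahead.end()` points just past “Windows”, so a following search would resume at the “2000” suffix.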

Common Regular Expressions

Username	/^[a-z0-9_-]{3,16}$/
Password	/^[a-z0-9_-]{6,18}$/
Password2	(?=^.{8,}$)(?=.*\d)(?=.*\W+)(?=.*[A-Z])(?=.*[a-z])(?!.*\n).*$ (must contain digits, uppercase letters, lowercase letters, and punctuation marks, with all four types required and at least 8 characters)
Hexadecimal value	/^#?([a-f0-9]{6}|[a-f0-9]{3})$/
Email address	/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/ or /^[a-z\d]+(\.[a-z\d]+)*@([\da-z](-[\da-z])?)+(\.{1,2}[a-z]+)+$/ or \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
URL	/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/ or [a-zA-Z]+://[^\s]*
IP	/((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)/ or /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
HTML tags	/^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/ or <(.*)(.*)>.*<\/\1>|<(.*) \/>
Remove code // comments	(?<!http:|\S)//.*$
Match double-byte characters (including Chinese characters)	[^\x00-\xff]
Chinese characters	[\u4e00-\u9fa5]
Range of Chinese Characters in Unicode Encoding	/^[\u2E80-\u9FFF]+$/
Chinese characters and full-width punctuation marks	[\u3000-\u301e\ufe10-\ufe19\ufe30-\ufe44\ufe50-\ufe6b\uff01-\uffee]
Date (Year-Month-Day)	(\d{4}|\d{2})-(0?[1-9]|1[0-2])-(0?[1-9]|[12][0-9]|3[01])
Date (Month/Day/Year)	(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])/(\d{4}|\d{2})
Time (Hours:Minutes, 24-hour format)	((1|0?)[0-9]|2[0-3]):([0-5][0-9])
Mainland China landline telephone number	(\d{4}-|\d{3}-)?(\d{8}|\d{7})
Mobile number (86)	1\d{10}
Mainland China Postal Codes	[1-9]\d{5}
Mainland China ID Number (15-digit or 18-digit)	\d{15}(\d\d[0-9xX])?
Non-negative integers (positive integers or zero)	\d+
Positive integer	[0-9]*[1-9][0-9]*
Negative integer	-[0-9]*[1-9][0-9]*
Integer	-?\d+
Decimal	(-?\d+)(\.\d+)?
Blank line	\n\s*\r or \n\n (EditPlus) or ^[\s\S ]*\n
QQ	[1-9]\d{4,}
Words not containing abc	\b((?!abc)\w)+\b
Match leading and trailing whitespace characters	^\s*|\s*$
Common editing replacements	The following patterns are replacements for certain special Chinese text (EditPlus): ^[0-9].*\n ^[^第].*\n ^[习题].*\n ^[\s\S ]*\n ^[0-9]*\. ^[\s\S ]*\n <p[^<>]*> href="javascript:if\(confirm\('(.*?)'\)\)window\.location='(.*?)'" <span style=".[^"]*rgb\(255,255,255\)">.[^<>]*</span> <DIV class=xs0>[\s\S]*?</DIV>
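
A few of the patterns in this table can be wrapped in a small validator. This is a sketch only, again assuming Python's `re` module; `PATTERNS` and `is_valid` are hypothetical names chosen for illustration. The patterns are adapted from the rows above: alternation is written as a plain `|`, the date pattern is written so that month 10 and days 10 and 20 are accepted, and `re.fullmatch` is used in place of explicit `^` and `$` anchors.

```python
import re

# Hypothetical names; patterns adapted from the table of common expressions.
PATTERNS = {
    "hex_color": r"#?([a-f0-9]{6}|[a-f0-9]{3})",
    "ipv4": (
        r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
        r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
    ),
    "date_ymd": r"(\d{4}|\d{2})-(0?[1-9]|1[0-2])-(0?[1-9]|[12][0-9]|3[01])",
    "time_24h": r"((1|0?)[0-9]|2[0-3]):([0-5][0-9])",
}


def is_valid(kind: str, text: str) -> bool:
    """Return True if the whole string matches the named pattern."""
    return re.fullmatch(PATTERNS[kind], text) is not None


if __name__ == "__main__":
    for kind, sample in [
        ("hex_color", "#a3c113"),
        ("ipv4", "192.168.1.1"),
        ("date_ymd", "2024-10-20"),
        ("time_24h", "24:00"),
    ]:
        print(kind, sample, is_valid(kind, sample))
```

These patterns check shape only: the date pattern, for instance, still accepts impossible dates such as February 31, so a real validator would normally parse the value as well.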

Regular Expression Syntax

This page is a quick reference for regular expression syntax: commonly used expressions, syntax lookup, essential syntax elements, basic structure, subexpression rules, modifiers, and greedy versus non-greedy matching.
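
Modifiers change how the metacharacters behave. As a brief sketch, assuming Python's `re` module and its flag names (`re.MULTILINE`, `re.DOTALL`, `re.IGNORECASE`; other engines spell these differently, for example as `m`, `s`, and `i`), the following shows the multiline behavior of `^`, the effect of letting `.` match a newline, and case-insensitive matching.

```python
import re

text = "first line\nsecond line"

# Without MULTILINE, ^ matches only at the very start of the input.
starts_default = re.findall(r"^\w+", text)
# With MULTILINE, ^ also matches right after each newline.
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# By default . does not match "\n"; DOTALL lifts that restriction.
dot_default = re.search(r"first.*second", text)
dot_all = re.search(r"first.*second", text, flags=re.DOTALL)

# IGNORECASE makes letter matching case-insensitive.
ignore_case = re.fullmatch(r"[a-z]+", "RegEx", flags=re.IGNORECASE)

print(starts_default, starts_multiline, dot_default, bool(dot_all), bool(ignore_case))
```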
