RegEx Pattern Library | PCRE2 Compliant Regular Expressions

This collection gathers regular expressions (RegEx) that come up often in software development, arranged for quick lookup. The expressions have been tested and are updated over time. Because regular expression syntax varies slightly between programs and tools, adjust each pattern to the engine you are using.

Note	Regular expression
URL	[a-zA-Z]+://[^\s]*
IP	((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
Email	\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
QQ	[1-9]\d{4,}
HTML tags (containing content or self-closing)	<(.*)(.*)>.*<\/\1>|<(.*) \/>
Password (must contain at least one digit, one uppercase letter, one lowercase letter, and one punctuation mark; minimum 8 characters)	(?=^.{8,}$)(?=.*\d)(?=.*\W+)(?=.*[A-Z])(?=.*[a-z])(?!.*\n).*$
Date (Year-Month-Day)	(\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9]))
Date (Month/Day/Year)	((1[0-2])|(0?[1-9]))/(([12][0-9])|(3[01])|(0?[1-9]))/(\d{4}|\d{2})
Time (Hours:Minutes, 24-hour format)	((1|0?)[0-9]|2[0-3]):([0-5][0-9])
Chinese characters	[\u4e00-\u9fa5]
Chinese and full-width punctuation marks	[\u3000-\u301e\ufe10-\ufe19\ufe30-\ufe44\ufe50-\ufe6b\uff01-\uffee]
Landline number (86)	(\d{4}-|\d{3}-)?(\d{8}|\d{7})
Mobile number (86)	1\d{10}
Mainland China Postal Codes	[1-9]\d{5}
Mainland China ID Number (15-digit or 18-digit)	\d{15}(\d\d[0-9xX])?
Non-negative integers (positive integers or zero)	\d+
Positive integer	[0-9]*[1-9][0-9]*
Negative integer	-[0-9]*[1-9][0-9]*
Integer	-?\d+
Decimal	(-?\d+)(\.\d+)?
Words not containing abc	\b((?!abc)\w)+\b
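
The patterns in this table carry no ^ or $ anchors, so on their own they match anywhere inside a longer string. A minimal Python sketch of using three of them for whole-string validation follows; it assumes the patterns are written with a plain | for alternation, and the constant names and the helper `is_valid` are illustrative, not part of the original collection.

```python
import re

# Patterns taken from the table above, with | as plain alternation.
IPV4 = re.compile(r"((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)")
TIME_24H = re.compile(r"((1|0?)[0-9]|2[0-3]):([0-5][0-9])")
DATE_YMD = re.compile(r"(\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9]))")


def is_valid(pattern, text):
    """Return True only if the entire string matches the pattern."""
    # fullmatch supplies the anchoring that the table patterns omit.
    return pattern.fullmatch(text) is not None


print(is_valid(IPV4, "192.168.1.1"))      # True
print(is_valid(IPV4, "256.1.1.1"))        # False: 256 is out of range
print(is_valid(TIME_24H, "24:00"))        # False: hours stop at 23
print(is_valid(DATE_YMD, "2024-02-30"))   # True: shape only, no calendar check
```

The last call shows a limit worth keeping in mind: the date pattern checks the shape of the string, not whether the day exists in that month.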

Regular expressions are useful and efficient for string processing, form validation, and similar scenarios. Common expressions are compiled here for future reference.

Note	Regular expression
Username	/^[a-z0-9_-]{3,16}$/
Password	/^[a-z0-9_-]{6,18}$/
Hexadecimal value	/^#?([a-f0-9]{6}|[a-f0-9]{3})$/
Email address	/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/
URL	/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
IP	/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
HTML tags	/^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/
Range of Chinese Characters in Unicode Encoding	/^[\u4e00-\u9fa5]{0,}$/
Regular expressions for matching Chinese characters	[\u4e00-\u9fa5]
Note: Matching Chinese characters is such a headache, but with this expression, it's a breeze.
Match double-byte characters (including Chinese characters)	[^\x00-\xff]
Note: Can be used to calculate the length of a string (counting double-byte characters as 2 and ASCII characters as 1).
Regular expression matching blank lines	\n\s*\r
Note: Can be used to remove blank lines.
Regular expressions for matching HTML tags	<(\S*?)[^>]*>.*?</\1>|<.*?/>
Note: Many versions circulating online are unreliable. The one above matches only simple cases and still fails on complex nested markup.
Regular expression matching leading and trailing whitespace characters	^\s*|\s*$
Note: This expression can be used to remove whitespace characters at the beginning and end of a line (including spaces, tabs, form feeds, etc.), making it a highly useful tool.
Regular expression for matching email addresses	\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Note: Very useful for form validation.
Regular expression for matching URLs	[a-zA-Z]+://[^\s]*
Note: The online versions have very limited functionality, but the one above should meet most needs.
Verify account legitimacy (must start with a letter, 5-16 bytes allowed, alphanumeric characters and underscores permitted)	^[a-zA-Z][a-zA-Z0-9_]{4,15}$
Note: Very useful for form validation.
Match phone numbers (86)	\d{3}-\d{8}|\d{4}-\d{7}
Note: Matching formats such as 0511-4405222 or 021-87888822
Match Tencent QQ number	[1-9][0-9]{4,}
Note: Tencent QQ numbers start from 10000.
Match Mainland China Postal Codes	[1-9]\d{5}(?!\d)
Note: Postal codes in mainland China consist of 6 digits.
Match ID card	\d{15}|\d{18}
Note: Mainland China's ID cards consist of 15 or 18 digits.
Match IP address	\d+\.\d+\.\d+\.\d+
Note: Useful when extracting IP addresses.
Match specific numbers:
^[1-9]\d*$	// Match positive integers
^-[1-9]\d*$	// Match negative integers
^-?[1-9]\d*$	// Match nonzero integers (0 itself is not matched)
^(?:[1-9]\d*|0)$	// Match non-negative integers (positive integers + 0)
^(?:-[1-9]\d*|0)$	// Match non-positive integers (negative integers + 0)
^(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*)$	// Match positive floating-point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$	// Match negative floating-point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$	// Match floating-point numbers
^(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$	// Match non-negative floating-point numbers (positive floating-point numbers + 0)
^(?:-(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$	// Match non-positive floating-point numbers (negative floating-point numbers + 0)
Note: Useful when handling large amounts of data; be sure to adjust for specific applications.
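
One detail to check when adapting the numeric patterns: | has the lowest precedence of any regex operator, so a ^ or $ written outside a group applies to a single branch only. The Python sketch below compares an ungrouped and a grouped form of the non-negative-integer pattern; the names `UNGROUPED`, `GROUPED`, and `accepts` are illustrative.

```python
import re

# Without a group the pattern parses as (^[1-9]\d*) | (0$).
UNGROUPED = re.compile(r"^[1-9]\d*|0$")
# The non-capturing group makes both anchors apply to both branches.
GROUPED = re.compile(r"^(?:[1-9]\d*|0)$")


def accepts(pattern, text):
    """Return True if the pattern finds a match anywhere in text."""
    return pattern.search(text) is not None


print(accepts(UNGROUPED, "12abc"))  # True: the first branch has no end anchor
print(accepts(UNGROUPED, "x0"))     # True: the second branch has no start anchor
print(accepts(GROUPED, "12abc"))    # False
print(accepts(GROUPED, "0"))        # True
```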
Match specific strings:
^[A-Za-z]+$	// Match strings composed of the 26 letters of the English alphabet
^[A-Z]+$	// Match strings composed of uppercase letters from the 26 letters of the English alphabet
^[a-z]+$	// Match strings composed of lowercase letters from the 26 letters of the English alphabet
^[A-Za-z0-9]+$	// Match strings composed of digits and the 26 letters of the English alphabet
^\w+$	// Matches strings composed of digits, the 26 letters of the English alphabet, or underscores.

Complete Collection of Regular Expressions: Regular expressions come in various styles. The following table provides a comprehensive list of metacharacters in PCRE and their behavior within regular expression contexts:

Character	Description
\	Mark the next character as a special character, a literal character, a backreference, or an octal escape sequence. For example, “n” matches the character ‘n’. “\n” matches a newline character. The sequence “\\” matches “\”, while “\(” matches “(”.
^	Matches the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches positions following “\n” or “\r”.
$	Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before “\n” or “\r”.
*	Matches the preceding subexpression zero or more times. For example, “zo*” matches “z” and “zoo”. * is equivalent to {0,}.
+	Matches the preceding subexpression one or more times. For example, “zo+” matches “zo” and “zoo”, but not “z”. + is equivalent to {1,}.
?	Matches the preceding subexpression zero or one time. For example, “do(es)?” matches “do” or “does”. ? is equivalent to {0,1}.
{n}	n is a non-negative integer. Matches exactly n times. For example, “o{2}” does not match the “o” in “Bob”, but it matches the two o's in “food”.
{n,}	n is a non-negative integer. Match at least n times. For example, `o{2,}` cannot match the ‘o’ in “Bob”, but it can match all 'o's in “foooood”. `o{1,}` is equivalent to `o+`. `o{0,}` is equivalent to `o*`.
{n,m}	m and n are both non-negative integers, where n<=m. Matches at least n times and at most m times. For example, “o{1,3}” will match the first three o's in ‘fooooood’. “o{0,1}” is equivalent to “o?”. Note that there should be no space between the comma and the two numbers.
?	When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. Non-greedy mode matches the least possible number of the search string, while the default greedy mode matches the maximum possible number. For example, for the string “oooo”, `o+?` matches a single ‘o’, while `o+` matches all “o”s.
.	Matches any single character except “\n”. To match any character including “\n”, use a pattern like “[\s\S]”.
(pattern)	Matches the pattern and captures the match. The captured match can be retrieved from the resulting Matches collection: in VBScript, use the SubMatches collection; in JScript, use the $0…$9 properties. To match literal parentheses, use “\(” or “\)”.
(?:pattern)	Matches the pattern but does not capture the match result, meaning this is a non-capturing match that is not stored for later use. This is useful when using the “or” character (|) to combine different parts of a pattern. For example, “industr(?:y|ies)” is a more concise expression than “industry|industries”.
(?=pattern)	Positive lookahead. Matches at any position where the text that follows matches the pattern. This is a non-capturing match, meaning the match does not need to be captured for later use. For example, “Windows(?=95|98|NT|2000)” matches “Windows” in “Windows2000” but does not match “Windows” in “Windows3.1”. Lookahead does not consume characters. This means that after a match occurs, the search for the next match begins immediately after the last match, rather than after the characters that made up the lookahead.
(?!pattern)	Negative lookahead. Matches at any position where the text that follows does not match the pattern. This is a non-capturing match, meaning the match does not need to be captured for later use. For example, “Windows(?!95|98|NT|2000)” matches “Windows” in “Windows3.1” but does not match “Windows” in “Windows2000”. Negative lookahead does not consume characters. This means that after a match occurs, the search for the next match begins immediately after the last match, rather than after the characters that made up the lookahead.
x|y	Matches either x or y. For example, “z|food” matches either “z” or “food”. “(z|f)ood” matches either “zood” or “food”.
[xyz]	Character set. Matches any character contained within it. For example, “[abc]” can match the ‘a’ in “plain”.
[^xyz]	Negative character set. Matches any character not included. For example, “[^abc]” can match the ‘p’ in “plain”.
[a-z]	Character range. Matches any character within the specified range. For example, “[a-z]” matches any lowercase letter character within the range from ‘a’ to “z”.
[^a-z]	Negative character set. Matches any character not within the specified range. For example, “[^a-z]” matches any character not within the range from ‘a’ to ‘z’.
\b	Matches a word boundary, that is, the position between a word character and a non-word character (such as a space), or the start or end of the string. For example, “er\b” matches the “er” in “never” but not the “er” in “verb”.
\B	Matches a non-word boundary. “er\B” matches the “er” in “verb” but does not match the “er” in “never”.
\cx	Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal “c” character.
\d	Matches a numeric character. Equivalent to [0-9].
\D	Matches a non-numeric character. Equivalent to [^0-9].
\f	Matches a form feed character. Equivalent to \x0c and \cL.
\n	Matches a newline (line feed) character. Equivalent to \x0a and \cJ.
\r	Matches a carriage return character. Equivalent to \x0d and \cM.
\s	Matches any whitespace character, including spaces, tabs, form feeds, etc. Equivalent to [ \f\n\r\t\v].
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t	Matches a tab character. Equivalent to \x09 and \cI.
\v	Matches a vertical tab. Equivalent to \x0b and \cK.
\w	Matches any word character including underscores. Equivalent to “[A-Za-z0-9_]”.
\W	Matches any non-word character. Equivalent to “[^A-Za-z0-9_]”.
\xn	Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, “\x41” matches “A”. “\x041” is equivalent to “\x04” followed by “1”. ASCII codes can be used in regular expressions this way.
\num	Matches num, where num is a positive integer. A reference to the captured match. For example, “(.)\1” matches two consecutive identical characters.
\n	Indicates an octal escape value or a backward reference. If at least n captured subexpressions precede \n, then n is a backward reference. Otherwise, if n is an octal digit (0-7), then n is an octal escape value.
\nm	Indicates an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, n is a backreference followed by the literal m. If neither preceding condition holds, and both n and m are octal digits (0-7), \nm matches the octal escape value nm.
\nml	If n is an octal digit (0-3) and both m and l are octal digits (0-7), then it matches the octal escape value nml.
\un	Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
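
Several rows of this table are easiest to confirm by running their examples. The sketch below replays the table's own examples for greedy versus non-greedy quantifiers, lookahead, backreferences, and word boundaries using Python's `re` module. Python is an assumption here: the table describes PCRE and JScript/VBScript behavior, and the helper name `first_match` is illustrative. For these particular constructs the engines agree.

```python
import re


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None


# Greedy o+ takes every "o"; non-greedy o+? stops after one.
print(first_match(r"o+", "oooo"))    # oooo
print(first_match(r"o+?", "oooo"))   # o

# Lookahead tests the following text without consuming it.
print(first_match(r"Windows(?=95|98|NT|2000)", "Windows2000"))  # Windows
print(first_match(r"Windows(?=95|98|NT|2000)", "Windows3.1"))   # None
print(first_match(r"Windows(?!95|98|NT|2000)", "Windows3.1"))   # Windows

# \1 refers back to whatever group 1 captured.
print(first_match(r"(.)\1", "food"))  # oo

# \b requires a word boundary after "er"; \B forbids one.
print(first_match(r"er\b", "never"))  # er
print(first_match(r"er\b", "verb"))   # None
print(first_match(r"er\B", "verb"))   # er
```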
