Regular Expression Syntax Cheat Sheet
A regular expression (regex or regexp) is a pattern of characters that describes a set of strings to search for in text.
Anchors
Anchors match a position before or after other characters.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
^ | Match the start of the string (or of each line in multiline mode) | ^r | rabbit, raccoon | parrot, ferret |
$ | Match the end of the string (or of each line in multiline mode) | t$ | rabbit, foot | trap, star |
\A | Match the start of the string, regardless of multiline mode | \Ar | rabbit, raccoon | parrot, ferret |
\Z | Match the end of the string, regardless of multiline mode | t\Z | rabbit, foot | trap, star |
\b | Match a word boundary (the position at the start or end of a word) | \bfox\b | the red fox ran, the fox ate | foxtrot, foxskin scarf |
\B | Match a position that is not a word boundary, such as between two word characters | \Bee\B | trees, beef | bee, tree |
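
The anchor rows above can be checked with a short script. This is a minimal sketch that assumes Python's built-in re module, which this cheat sheet does not name, so other regex flavors may differ in details. The word lists reuse the table's examples and the variable names are illustrative.

```python
import re

# ^r matches "r" only at the very start of each string.
starts_with_r = [w for w in ["rabbit", "raccoon", "parrot", "ferret"]
                 if re.search(r"^r", w)]

# t$ matches "t" only at the very end of each string.
ends_with_t = [w for w in ["rabbit", "foot", "trap", "star"]
               if re.search(r"t$", w)]

# \bfox\b matches "fox" only when it stands alone as a word.
whole_word_fox = [s for s in ["the red fox ran", "foxtrot", "foxskin scarf"]
                  if re.search(r"\bfox\b", s)]

# \Bee\B matches "ee" only when it has word characters on both sides.
inner_ee = [w for w in ["trees", "beef", "bee", "tree"]
            if re.search(r"\Bee\B", w)]

# In multiline mode ^ also matches after a line break; \A never does.
text = "parrot\nrabbit"
caret_multiline = re.findall(r"^r", text, flags=re.MULTILINE)
string_start = re.findall(r"\Ar", text, flags=re.MULTILINE)

print(starts_with_r)    # ['rabbit', 'raccoon']
print(ends_with_t)      # ['rabbit', 'foot']
print(whole_word_fox)   # ['the red fox ran']
print(inner_ee)         # ['trees', 'beef']
print(caret_multiline)  # ['r']
print(string_start)     # []
```
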
Matching types of character
Rather than matching specific characters, you can match specific types of characters such as letters, numbers, and more.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
. | Match any character except a line break | c.e | clean, cheap | acert, cent |
\d | Match a digit | \d | 6060-842, 2b\|^2b | two, **___ |
\D | Match a non-digit | \D | The 5 cats ate, 12 Angry men | 52, 10032 |
\w | Match a word character (a letter, digit, or underscore) | \wee\w | trees, bee4 | The bee, eels eat meat |
\W | Match non-word characters | \Wbat\W | At bat Swing the bat fast | wombat bat53 |
\s | Match whitespace | \sfox\s | the fox ate, his fox ran | foxfur, it’s the fox. |
\S | Match non-whitespace | \See\S | trees, beef | the bee stung, The tall tree |
\metacharacter | Escape a metacharacter to match it literally | \. or \^ | 2^3, The cat ate. | 23, the cat ate |
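
The shorthand classes above can be tried out as follows. This is a small sketch assuming Python's built-in re module (an assumption, since the cheat sheet is not tied to one flavor); the sample strings come from the table and the variable names are illustrative.

```python
import re

# . matches any single character except a line break.
dot_hit = re.search(r"c.e", "clean").group()
dot_miss = re.search(r"c.e", "cent")

# \d picks out the digits, \D picks out everything else.
digits = re.findall(r"\d", "6060-842")
non_digits = re.findall(r"\D", "6060-842")

# \s requires a whitespace character on each side of "fox" here.
spaced_fox = re.search(r"\sfox\s", "the fox ate").group()
no_spaced_fox = re.search(r"\sfox\s", "it's the fox.")

# An escaped dot matches only a literal ".", not any character.
literal_dots = re.findall(r"\.", "The cat ate.")

print(dot_hit)           # cle
print(dot_miss)          # None
print(digits)            # ['6', '0', '6', '0', '8', '4', '2']
print(non_digits)        # ['-']
print(repr(spaced_fox))  # ' fox '
print(no_spaced_fox)     # None
print(literal_dots)      # ['.']
```
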
Character classes
Character classes are sets or ranges of characters.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
[xy] | Match any one of the listed characters | gr[ea]y | gray, grey | green, greek |
[x-y] | Match any one character in a range | [a-e] | amber, brand | fox, join |
[^xy] | Match any one character except those listed | gr[^ea]y | gr3y, groyne | gray, grey, green, greek |
[\^-] | Match metacharacters literally inside a character class | 4[\^\.\-+*/]\d | 4^3, 4.2 | 44, 23 |
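
A quick way to see how character classes behave is to run them against a few strings. The sketch below assumes Python's built-in re module (not named by this cheat sheet). The string "gr3y" and the variable names are made up for illustration.

```python
import re

# [ea] accepts exactly one of the listed characters.
either = re.findall(r"gr[ea]y", "gray grey green")

# [a-e] accepts any one character in the range a through e.
in_range = re.findall(r"[a-e]", "amber")

# [^ea] accepts any one character except e or a.
neither = re.findall(r"gr[^ea]y", "gray grey gr3y")

# Inside a class, escape ^ and - (or put - last) to match them literally.
operators = re.findall(r"4[\^.\-+*/]\d", "4^3 4.2 44 23")

# An unescaped - between two characters is read as a range. A range whose
# end sorts before its start, such as .-+ here, is rejected by Python's re.
try:
    re.compile(r"[.-+]")
    reversed_range_ok = True
except re.error:
    reversed_range_ok = False

print(either)             # ['gray', 'grey']
print(in_range)           # ['a', 'b', 'e']
print(neither)            # ['gr3y']
print(operators)          # ['4^3', '4.2']
print(reversed_range_ok)  # False
```
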
Repetition
Rather than matching single instances of characters, you can match repeated characters.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
x* | Match zero or more times | ar*o | cacao, carrot | arugula, artichoke |
x+ | Match one or more times | re+ | green, tree | trap, ruined |
x? | Match zero or one time | ro?a | roast, rant | root, rear |
x{m} | Match exactly m times | \we{2}\w | deer, seer | red, enter |
x{m,} | Match m or more times | 2{3,}4 | 671-2224, 2222224 | 224, 123 |
x{m,n} | Match between m and n times | 12{1,3}3 | 1234, 1222384 | 15335, 1222223 |
x*?, x+?, etc. | Match as few times as possible (a lazy quantifier) | re+? | tree, freeeee | trout, roasted |
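
The difference between greedy and lazy quantifiers is easiest to see by comparing what each one actually captures. This is a minimal sketch assuming Python's built-in re module (an assumption; the cheat sheet is flavor-neutral), using strings from the table and illustrative variable names.

```python
import re

# * allows zero repetitions, so the "ao" in "cacao" matches ar*o.
zero_or_more = re.findall(r"ar*o", "cacao carrot arugula")

# {m,} and {m,n} bound how many repetitions are allowed.
at_least_three = re.findall(r"2{3,}4", "671-2224 2222224 224")
one_to_three = re.findall(r"12{1,3}3", "1234 1222384 1222223")

# Greedy + takes as many "e" as it can; lazy +? stops after the first.
greedy = re.search(r"re+", "freeeee").group()
lazy = re.search(r"re+?", "freeeee").group()

print(zero_or_more)    # ['ao', 'arro']
print(at_least_three)  # ['2224', '2222224']
print(one_to_three)    # ['123', '12223']
print(greedy)          # reeeee
print(lazy)            # re
```
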
Capturing, alternation & backreferences
In order to extract specific parts of a string, you can capture those parts, and even name the parts that you captured.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
(x) | Capture a pattern | (iss)+ | Mississippi, missed | mist, persist |
(?:x) | Create a group without capturing | (?:ab)(cd) | Match: abcd, Group 1: cd | acbd |
(?<name>x) | Create a named capture group | (?<first>\d)(?<second>\d)\d* | Match: 1325, first: 1, second: 3 | 2, hello |
(x\|y) | Match any one of several alternative patterns | (re\|ba) | red, banter | rant, bear |
\n | Reference a previous capture, where n is the group index starting at 1 | (b)(\w*)\1 | blob, bribe | bear, bring |
\k<name> | reference named captures | (?<first>5)(\d*)\k<first> | 51245 55 | 523 51 |
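
Groups and backreferences can be explored with the sketch below. It assumes Python's built-in re module, which is an assumption on top of this cheat sheet. Python spells a named group (?P<name>x) and a named backreference (?P=name), so those two patterns are adapted from the table's forms. Variable names are illustrative.

```python
import re

# (?:...) groups without capturing, so "cd" is group 1.
plain_groups = re.search(r"(?:ab)(cd)", "abcd").groups()

# Python's re spells a named group (?P<name>...).
named = re.search(r"(?P<first>\d)(?P<second>\d)\d*", "1325").groupdict()

# \1 must repeat whatever group 1 captured ("b" here).
numbered_hit = re.search(r"(b)(\w*)\1", "blob").group()
numbered_miss = re.search(r"(b)(\w*)\1", "bear")

# Python's re spells a named backreference (?P=name).
named_hit = re.search(r"(?P<first>5)(\d*)(?P=first)", "51245").group()
named_miss = re.search(r"(?P<first>5)(\d*)(?P=first)", "523")

# | tries each alternative at every position.
alternatives = re.findall(r"re|ba", "red banter")

print(plain_groups)   # ('cd',)
print(named)          # {'first': '1', 'second': '3'}
print(numbered_hit)   # blob
print(numbered_miss)  # None
print(named_hit)      # 51245
print(named_miss)     # None
print(alternatives)   # ['re', 'ba']
```
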
Lookahead and lookbehind
You can require that specific characters appear before or after your match without including those characters in the match.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
(?=x) | Look ahead: match only if the next characters are x, without including them in the match | an(?=an) or iss(?=ipp) | banana, Mississippi | band, missed |
(?!x) | Negative lookahead: match only if the next characters are not x | ai(?!n) | fail, brail | faint, train |
(?<=x) | Look behind: match only if the previous characters are x, without including them in the match | (?<=tr)a | trail, translate | bear, streak |
(?<!x) | Negative lookbehind: match only if the previous characters are not x | (?<!tr)a | bear, translate | trail, strained |
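
Because lookarounds check the surrounding text without consuming it, the clearest check is to look at where each match starts and ends. This is a minimal sketch assuming Python's built-in re module (an assumption, as the cheat sheet does not name a flavor); the variable names are illustrative.

```python
import re

text = "trail translate"

# (?<=tr)a finds an "a" that is directly preceded by "tr".
after_tr = [m.start() for m in re.finditer(r"(?<=tr)a", text)]

# (?<!tr)a finds an "a" that is not directly preceded by "tr".
not_after_tr = [m.start() for m in re.finditer(r"(?<!tr)a", text)]

# (?=ipp) must follow "iss" but is not part of the match itself.
m = re.search(r"iss(?=ipp)", "Mississippi")
lookahead_span = m.span()
lookahead_text = m.group()

# (?!n) rejects "ai" when the next character is "n".
negative_hit = re.search(r"ai(?!n)", "fail").group()
negative_miss = re.search(r"ai(?!n)", "faint")

print(after_tr)        # [2, 8]
print(not_after_tr)    # [12]
print(lookahead_span)  # (4, 7)
print(lookahead_text)  # iss
print(negative_hit)    # ai
print(negative_miss)   # None
```
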
Literal matches and modifiers
Modifiers are settings that change the way the matching rules work.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
\Qx\E | Match x literally, treating any metacharacters between \Q and \E as ordinary characters | \Qtell\E or \Q\d\E | tell, \d | I have 5 coins |
(?i)x(?-i) | Make x case-insensitive | (?i)te(?-i) | sTep, tEach | Trench, bear |
(?x)x(?-x) | Ignore whitespace in the pattern x | (?x)t a p(?-x) | tap, tap dance | c a t, rot a potato |
(?s)x(?-s) | Turn on single-line (DOTALL) mode, in which “.” also matches line breaks (\n) | (?s)first.and.second(?-s) | first and second, first\nand\nsecond (\n is a line break) | firstandsecond |
(?m)x(?-m) | Make ^ and $ match at the start and end of each line rather than only at the start and end of the string | (?m)^eat and sleep$(?-m) | eat and sleep\neat and sleep (\n is a line break; each line matches) | treat and sleep, eat and sleep. |
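
The modifiers above are written differently from one regex flavor to another. The sketch below assumes Python's built-in re module, which is an assumption on top of this cheat sheet. That module has no \Q...\E and expects inline flags such as (?x) at the start of the pattern, so the sketch substitutes re.escape for literal matching and the scoped form (?i:x) for case-insensitivity. Variable names and sample strings are illustrative.

```python
import re

# re.escape plays the role of \Q...\E: it escapes every metacharacter.
literal = re.escape(r"\d")
literal_hit = re.search(literal, r"match \d literally").group()
literal_miss = re.search(literal, "I have 5 coins")

# (?i:...) limits case-insensitive matching to the enclosed part.
mixed_case = re.findall(r"(?i:te)", "sTep tEach Trench")

# (?x) at the start of the pattern ignores whitespace inside it.
verbose_hit = re.search(r"(?x) t a p", "tap dance").group()

# (?s) lets . match a line break.
dotall_hit = re.search(r"(?s)first.second", "first\nsecond")
dotall_miss = re.search(r"first.second", "first\nsecond")

# (?m) lets ^ and $ match at every line, not just the whole string.
lines = "eat and sleep\neat and sleep"
multiline_hits = re.findall(r"(?m)^eat and sleep$", lines)
default_hits = re.findall(r"^eat and sleep$", lines)

print(literal_hit)              # \d
print(literal_miss)             # None
print(mixed_case)               # ['Te', 'tE']
print(verbose_hit)              # tap
print(dotall_hit is not None)   # True
print(dotall_miss)              # None
print(len(multiline_hits))      # 2
print(default_hits)             # []
```
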
Unicode
Regular expressions can work beyond the Roman alphabet, with things like Chinese characters or emoji.
- Code points: The number, usually written in hexadecimal, that identifies an abstract character in a system like Unicode.
- Graphemes: The smallest units a reader perceives as a single character. A grapheme is made up of one or more code points in a sequence, such as a base letter followed by a combining accent.
Syntax | Description | Example pattern | Example matches | Example non-matches |
---|---|---|---|---|
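\uXXXX | Match a character by its hexadecimal code point | \u0040gmail | @gmail, www.email@gmail | gmail, @aol |
\X | Match one grapheme, such as an accented letter stored as one code point (\u00e8) or as a base letter plus a combining mark (\u0065\u0300) | ^\X$ | è | èe |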
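
The two ways of storing an accented character can be compared directly. This is a hedged sketch assuming Python's built-in re and unicodedata modules, neither of which is named by this cheat sheet. Python's built-in re does not provide \X, so the sketch normalizes the text to one form before matching instead. Variable names are illustrative.

```python
import re
import unicodedata

# \u0040 is the code point of "@".
at_sign_hit = re.search(r"\u0040gmail", "www.email@gmail").group()
at_sign_miss = re.search(r"\u0040gmail", "gmail")

# The same visible character can be stored two ways.
precomposed = "\u00e8"       # one code point
combining = "\u0065\u0300"   # "e" followed by a combining grave accent
lengths = (len(precomposed), len(combining))

# A pattern written with one form does not match the other form...
raw_match = re.fullmatch(r"\u00e8", combining)

# ...unless the text is first normalized to the composed form (NFC).
normalized = unicodedata.normalize("NFC", combining)
normalized_match = re.fullmatch(r"\u00e8", normalized)

print(at_sign_hit)                   # @gmail
print(at_sign_miss)                  # None
print(lengths)                       # (1, 2)
print(raw_match)                     # None
print(normalized_match is not None)  # True
```
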