Regular Expression Syntax Cheat Sheet

A regular expression (regex or regexp) is a pattern of characters that describes the text you want to match.

Anchors

Anchors match a position before or after other characters.

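| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| ^ | Match the start of the string (or of each line in multiline mode) | ^r | rabbit, raccoon | parrot, ferret |
| $ | Match the end of the string (or of each line in multiline mode) | t$ | rabbit, foot | trap, star |
| \A | Match the start of the string, regardless of multiline mode | \Ar | rabbit, raccoon | parrot, ferret |
| \Z | Match the end of the string, regardless of multiline mode | t\Z | rabbit, foot | trap, star |
| \b | Match a word boundary (the start or end of a word) | \bfox\b | "the red fox ran", "the fox ate" | foxtrot, "foxskin scarf" |
| \B | Match a position that is not a word boundary (inside a word) | \Bee\B | trees, beef | bee, tree |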
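
As a rough illustration, the anchors above can be tried with Python's built-in re module. This is only a sketch: the variable names are made up for the example, and other regex engines may treat ^, $, \A and \Z slightly differently.

```python
import re

words = ["rabbit", "raccoon", "parrot", "ferret"]

# ^r keeps only the words that start with "r".
starts_with_r = [w for w in words if re.search(r"^r", w)]

# \bfox\b needs a word boundary on both sides of "fox".
phrases = ["the red fox ran", "the fox ate", "foxtrot", "foxskin scarf"]
whole_word_fox = [p for p in phrases if re.search(r"\bfox\b", p)]

# \Bee\B needs word characters on both sides of "ee".
inner_ee = [w for w in ["trees", "beef", "bee", "tree"] if re.search(r"\Bee\B", w)]

# In Python, ^ matches only at the start of the string unless
# re.MULTILINE is set; \A never matches after a line break.
text = "parrot\nrabbit"
caret_default = re.findall(r"^\w+", text)
caret_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)
string_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

print(starts_with_r)    # ['rabbit', 'raccoon']
print(whole_word_fox)   # ['the red fox ran', 'the fox ate']
print(inner_ee)         # ['trees', 'beef']
print(caret_default)    # ['parrot']
print(caret_multiline)  # ['parrot', 'rabbit']
print(string_start)     # ['parrot']
```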

Matching types of character

Rather than matching specific characters, you can match specific types of characters such as letters, numbers, and more.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| . | Match any character except a line break | c.e | clean, cheap | acert, cent |
| \d | Match a digit | \d | 6060-842, 2b|^2b | two, **___ |
| \D | Match a non-digit | \D | "The 5 cats ate", "12 Angry men" | 52, 10032 |
| \w | Match a word character (letter, digit, or underscore) | \wee\w | trees, bee4 | "The bee", "eels eat meat" |
| \W | Match a non-word character | \Wbat\W | "At bat!", "Swing the bat fast" | wombat, bat53 |
| \s | Match a whitespace character | \sfox\s | "the fox ate", "his fox ran" | "it’s the fox.", foxfur |
| \S | Match a non-whitespace character | \See\S | trees, beef | "the bee stung", "The tall tree" |
| \metacharacter | Escape a metacharacter to match it literally | \., \^ | "The cat ate.", 2^3 | "the cat ate", 23 |
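
The character types above can be checked with a short Python sketch. It assumes the built-in re module, and the variable names are purely illustrative.

```python
import re

# . matches any single character except a line break.
dot_match = re.search(r"c.e", "clean").group()
dot_miss = re.search(r"c.e", "cent")

# \d and \D split a string into digits and everything else.
digits = re.findall(r"\d", "6060-842")
non_digits = re.findall(r"\D", "6060-842")

# \w covers letters, digits, and underscore, so the "4" in "bee4" counts.
word_match = re.search(r"\wee\w", "bee4").group()

# \s needs real whitespace on both sides of "fox".
space_hits = [s for s in ["the fox ate", "foxfur"] if re.search(r"\sfox\s", s)]

# A backslash turns the metacharacter "." into a literal full stop.
literal_dot = [s for s in ["The cat ate.", "the cat ate"] if re.search(r"\.", s)]

print(dot_match, dot_miss)  # cle None
print(digits)               # ['6', '0', '6', '0', '8', '4', '2']
print(non_digits)           # ['-']
print(word_match)           # bee4
print(space_hits)           # ['the fox ate']
print(literal_dot)          # ['The cat ate.']
```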

Character classes

Character classes are sets or ranges of characters.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| [xy] | Match any one of the listed characters | gr[ea]y | gray, grey | green, greek |
| [x-y] | Match any one character in the range | [a-e] | amber, brand | fox, join |
| [^xy] | Match any one character that is not listed | gre[^y] | green, greek | gray, grey |
| [\^-] | Match metacharacters literally inside a character class | 4[\^.+*/-]\d | 4^3, 4.2 | 44, 23 |
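
Here is a small Python sketch of character classes, assuming the built-in re module. In the last pattern the hyphen is placed at the end of the class so that it is read as a literal character rather than as a range operator; the variable names are illustrative only.

```python
import re

words = ["gray", "grey", "green", "greek"]

# [ea] accepts either spelling of the vowel.
either_spelling = [w for w in words if re.search(r"gr[ea]y", w)]

# [a-e] matches any single letter from "a" to "e".
in_range = [w for w in ["amber", "brand", "fox", "join"] if re.search(r"[a-e]", w)]

# [^ea] matches any single character other than "e" or "a".
not_e_or_a = re.findall(r"[^ea]", "grey")

# Inside a class, escape ^ and keep - last so both are literal.
operator_class = r"4[\^.+*/-]\d"
operator_hits = [s for s in ["4^3", "4.2", "44", "23"] if re.search(operator_class, s)]

print(either_spelling)  # ['gray', 'grey']
print(in_range)         # ['amber', 'brand']
print(not_e_or_a)       # ['g', 'r', 'y']
print(operator_hits)    # ['4^3', '4.2']
```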

Repetition

Rather than matching single instances of characters, you can match repeated characters.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| x* | Match zero or more times | ar*o | cacao, carrot | arugula, artichoke |
| x+ | Match one or more times | re+ | green, tree | trap, ruined |
| x? | Match zero or one time | ro?a | roast, rant | root, rear |
| x{m} | Match m times | \we{2}\w | deer, seer | red, enter |
| x{m,} | Match m or more times | 2{3,}4 | 671-2224, 2222224 | 224, 123 |
| x{m,n} | Match between m and n times | 12{1,3}3 | 1234, 1222384 | 15335, 1222223 |
| x*?, x+?, etc. | Match the minimum number of times (known as a lazy quantifier) | re+? | tree, freeeee | trout, roasted |
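
The difference between greedy and lazy quantifiers is easiest to see by looking at what each one actually captures. The following is a minimal sketch using Python's built-in re module, with illustrative variable names.

```python
import re

# * allows zero repetitions, so "ao" in "cacao" is enough for ar*o.
zero_or_more = [re.search(r"ar*o", w).group() for w in ["cacao", "carrot"]]

# {3,} needs at least three 2s before the 4.
at_least_three = re.search(r"2{3,}4", "671-2224").group()

# {1,3} allows one to three 2s; five in a row is too many.
bounded = re.search(r"12{1,3}3", "1222384").group()
too_many = re.search(r"12{1,3}3", "1222223")

# Greedy + takes as many "e"s as it can; lazy +? stops after the first.
greedy = re.search(r"re+", "freeeee").group()
lazy = re.search(r"re+?", "freeeee").group()

print(zero_or_more)       # ['ao', 'arro']
print(at_least_three)     # 2224
print(bounded, too_many)  # 12223 None
print(greedy, lazy)       # reeeee re
```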

Capturing, alternation & backreferences

In order to extract specific parts of a string, you can capture those parts, and even name the parts that you captured.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| (x) | Capture a pattern as a group | (iss)+ | Mississippi, missed | mist, persist |
| (?:x) | Create a group without capturing | (?:ab)(cd) | abcd (group 1: cd) | acbd |
| (?<name>x) | Create a named capture group | (?<first>\d)(?<second>\d)\d* | 1325 (first: 1, second: 3) | 2, hello |
| (x|y) | Match one of several alternative patterns | (re|ba) | red, banter | rant, bear |
| \n | Reference an earlier capture, where n is the group index starting at 1 | (b)(\w*)\1 | blob, bribe | bear, bring |
| \k<name> | Reference a named capture | (?<first>5)(\d*)\k<first> | 51245, 55 | 523, 51 |
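
The sketch below shows groups and backreferences in Python. One caveat: Python's built-in re module spells named groups as (?P<name>...) and named backreferences as (?P=name), so the patterns differ slightly from the (?<name>...) and \k<name> forms listed above. Variable names are illustrative.

```python
import re

# Named groups can be read back by name after a match.
m = re.search(r"(?P<first>\d)(?P<second>\d)\d*", "1325")
whole = m.group()
first, second = m.group("first"), m.group("second")

# (?:ab) groups without capturing, so (cd) is group 1.
only_group = re.search(r"(?:ab)(cd)", "abcd").groups()

# \1 must repeat whatever group 1 captured (here, the letter "b").
numbered = [w for w in ["blob", "bribe", "bear", "bring"]
            if re.search(r"(b)(\w*)\1", w)]

# (?P=first) must repeat whatever the group named "first" captured.
named = [s for s in ["51245", "55", "523", "51"]
         if re.search(r"(?P<first>5)(\d*)(?P=first)", s)]

print(whole, first, second)  # 1325 1 3
print(only_group)            # ('cd',)
print(numbered)              # ['blob', 'bribe']
print(named)                 # ['51245', '55']
```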

Lookahead and lookbehind

You can require that specific characters appear (or do not appear) before or after your match, without including those characters in the match.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| (?=x) | Lookahead: the next characters must match x, but are not included in the match | an(?=an), iss(?=ipp) | banana, Mississippi | band, missed |
| (?!x) | Negative lookahead: the next characters must not match x | ai(?!n) | fail, brail | faint, train |
| (?<=x) | Lookbehind: the previous characters must match x, but are not included in the match | (?<=tr)a | trail, translate | bear, streak |
| (?<!x) | Negative lookbehind: the previous characters must not match x | (?<!tr)a | bear, translate | trail, strained |
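
To see that the looked-at characters are not part of the match, it helps to print the matched text and its position. This is a minimal sketch with Python's built-in re module; the variable names are illustrative.

```python
import re

# Lookahead: "iss" only counts when "ipp" follows, and "ipp" is not consumed.
ahead = re.search(r"iss(?=ipp)", "Mississippi")
ahead_text, ahead_start = ahead.group(), ahead.start()

# Only the first "an" in "banana" is followed by another "an".
an_hits = re.findall(r"an(?=an)", "banana")

# Negative lookahead: "ai" must not be followed by "n".
not_followed = [w for w in ["fail", "brail", "faint", "train"]
                if re.search(r"ai(?!n)", w)]

# Lookbehind: the "a" directly after "tr" (index 2 in "translate").
behind_start = re.search(r"(?<=tr)a", "translate").start()

# Negative lookbehind: the first "a" that does not follow "tr" (index 6).
not_behind_start = re.search(r"(?<!tr)a", "translate").start()

print(ahead_text, ahead_start)  # iss 4
print(an_hits)                  # ['an']
print(not_followed)             # ['fail', 'brail']
print(behind_start)             # 2
print(not_behind_start)         # 6
```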

Literal matches and modifiers

Modifiers are settings that change the way the matching rules work.

| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| \Qx\E | Match x literally, so that metacharacters between \Q and \E lose their special meaning | \Qtell\E, \Q\d\E | tell, \d | "I have 5 coins" |
| (?i)x(?-i) | Make x case-insensitive | (?i)te(?-i) | sTep, tEach | Trench, bear |
| (?x)x(?-x) | Ignore whitespace in x | (?x)t a p(?-x) | tap, tapdance | "c a t", "rot a potato" |
| (?s)x(?-s) | Turn on single-line (DOTALL) mode, in which “.” also matches newline characters (\n) | (?s)first.and.second(?-s).and.third | "first and⏎second and third" | "first and⏎second⏎and third" |
| (?m)x(?-m) | Turn on multiline mode, in which ^ and $ match at the start and end of each line rather than only of the whole string | (?m)^eat and sleep$ | "eat and sleep⏎eat and⏎sleep" | "treat and sleep", "eat and sleep." |

In the examples, ⏎ marks a line break.
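
The sketch below tries the same modifiers in Python, with some caveats. Python's built-in re module does not accept \Q...\E or a standalone (?-i), so this example assumes the closest equivalents: re.escape for literal text and the scoped form (?i:...) for a flag that applies to only part of the pattern. Variable names are illustrative.

```python
import re

# re.escape plays the role of \Q...\E: the pattern matches "\d" literally.
literal = re.escape(r"\d")
literal_hits = [s for s in [r"\d", "I have 5 coins"] if re.search(literal, s)]

# (?i:te) makes only "te" case-insensitive.
scoped_ci = [w for w in ["sTep", "tEach", "Trench", "bear"]
             if re.search(r"(?i:te)", w)]

# (?x) ignores whitespace in the pattern, so "t a p" means "tap".
verbose = re.search(r"(?x) t a p", "tapdance").group()

# (?s) lets "." match the line break between "and" and "second".
text = "first and\nsecond and third"
dot_default = re.search(r"and.second", text)
dot_all = re.search(r"(?s)and.second", text).group()

# (?m) lets ^ and $ match at the start and end of each line.
lines = "eat and sleep\neat and\nsleep"
whole_string = re.search(r"^eat and sleep$", lines)
per_line = re.search(r"(?m)^eat and sleep$", lines).group()

print(literal_hits)   # ['\\d']
print(scoped_ci)      # ['sTep', 'tEach']
print(verbose)        # tap
print(dot_default)    # None
print(repr(dot_all))  # 'and\nsecond'
print(whole_string)   # None
print(per_line)       # eat and sleep
```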

Unicode

Regular expressions can work beyond the Roman alphabet, with things like Chinese characters or emoji.

  • Code points: The number, usually written in hexadecimal, that identifies an abstract character in a system like Unicode.
  • Graphemes: What a reader perceives as a single character. Each grapheme is made up of one or more code points in sequence.
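| Syntax | Description | Example pattern | Example matches | Example non-matches |
| --- | --- | --- | --- | --- |
| \uXXXX | Match the character with the hexadecimal code point XXXX | \u0040gmail | @gmail, www.email@gmail | gmail, @aol |
| \X | Match a grapheme, such as a character with an accent, written as one code point or as a base letter plus a combining mark | \u00e8 or \u0065\u0300 | è | e |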
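
The two ways of writing an accented character can be compared with a short Python sketch. Python's built-in re module understands \uXXXX escapes but, as far as I know, not \X (the third-party regex package is one option that reportedly supports it). This example therefore assumes a different technique, normalizing the text with unicodedata before matching. Variable names are illustrative.

```python
import re
import unicodedata

# \u0040 is the code point for "@".
at_hits = [s for s in ["@gmail", "www.email@gmail", "gmail", "@aol"]
           if re.search(r"\u0040gmail", s)]

composed = "\u00e8"          # one code point
decomposed = "\u0065\u0300"  # "e" followed by a combining grave accent

# Both look the same to a reader but differ in length.
lengths = (len(composed), len(decomposed))

# A pattern written with one form does not match text stored in the other...
raw_match = re.search(r"\u00e8", decomposed)

# ...unless the text is normalized to the composed form (NFC) first.
normalized = unicodedata.normalize("NFC", decomposed)
normalized_match = re.search(r"\u00e8", normalized)

print(at_hits)                       # ['@gmail', 'www.email@gmail']
print(lengths)                       # (1, 2)
print(raw_match)                     # None
print(normalized_match is not None)  # True
```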

Resources

  1. https://www.datacamp.com/cheat-sheet/regular-expresso