
Regular Expression Character Escaping - By Robert Elder

This article is part of a Series On Regular Expressions.

In the last few sections of this series, we've gained an understanding of character classes, quantifiers and alternation. You'll be pleased to know that you're now very close to being able to read or write almost any regular expression. There's just one remaining major common cause of confusion: character escaping. In this section, we'll try to gain a fairly comprehensive understanding of the most common ways to escape characters in a regular expression. When learning how to correctly escape characters in a regular expression, it helps to divide the escaping rules into two different lists of characters that need to be escaped: one for characters inside a character class, and one for characters outside a character class. To keep this simple, this article will not consider Unicode characters and instead focus only on single-byte ASCII characters.

Character Escaping Inside A Character Class

In the sections Character Classes in Regular Expressions - A Gentle Introduction and Character Ranges & Class Negation in Regular Expressions we reviewed the 5 characters that need to be escaped inside a character class (anywhere inside [ ]):

1) [ Purpose: Start of character class.
2) ] Purpose: End of character class.
3) \ Purpose: Escape character.
4) - Purpose: Character range.
5) ^ Purpose: Character class negation.

Character Escaping Outside A Character Class

But what can we say about characters that always need to be escaped outside a character class? For starters, let's consider the 5 characters from the list above, and take note of a few differences:

1) [ This character obviously needs to be escaped outside a character class, otherwise, you'd never be able to define a character class!
2) ] Since this character is used to close a character class, some regular expression flavours will let you get away with not escaping a ']' as long as it's outside a character class. Some environments (such as in the Ruby programming language) will require that you escape it.
3) \ The '\' character is used to specify escaped characters both inside and outside a character class. Therefore, it must always be escaped when specifying a literal '\' character.

The '-' character is the exception: you can't specify character ranges outside a character class. Therefore, the '-' character does not need to be escaped outside a character class.

4) ^ The '^' character needs to be escaped outside a character class too, but for a different reason (which will be discussed later).

As you can see, the list of characters that need to be escaped outside a character class includes all the characters that need to be escaped inside a character class (except for the '-' character).

But what other characters need to be escaped outside a character class? Unsurprisingly, you also need to escape the characters that are used to represent quantifiers:

5) * This character must be escaped since it represents the 'zero or more' quantifier.
6) + This character must be escaped since it represents the 'one or more' quantifier.
7) ? This character must be escaped since it represents the 'zero or one' quantifier. The '?' character also has another special meaning (that hasn't been discussed yet) when it's used as the first character inside a set of parentheses.
8) { and } Whether you should escape the '{' and '}' characters even if you don't need to depends on the implementation. For some regular expression implementations, these characters need to be escaped. For others, they don't need to be escaped, but they'll just be treated as literal characters anyway.

However, the specifications for some regular expression implementations (POSIX for example) state that when you escape a character that doesn't need to be escaped, the result will be 'undefined behaviour'.
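The split between escaping inside and outside a character class can be tried out with a short sketch. This one assumes Python's built-in `re` module, which is only one regular expression flavour and is not one this article names, so other implementations may behave differently. The names `LITERAL_QUANTIFIERS`, `FIVE_SPECIALS`, `RANGE_LIKE` and `matches_literally` are hypothetical and exist only for illustration.

```python
import re

# Outside a character class, the quantifier characters * + ? are escaped
# so that they match themselves instead of acting as quantifiers.
LITERAL_QUANTIFIERS = re.compile(r"a\*b\+c\?")

# Inside a character class, the five characters [ ] \ - ^ are escaped here.
FIVE_SPECIALS = re.compile(r"[\[\]\\\-\^]+")

# Outside a character class '-' cannot form a range, so it is left unescaped.
RANGE_LIKE = re.compile(r"[0-9]+-[0-9]+")


def matches_literally(text: str) -> bool:
    """Hypothetical helper: let re.escape add the backslashes, then check
    that the escaped pattern matches the original text exactly."""
    return re.fullmatch(re.escape(text), text) is not None


if __name__ == "__main__":
    print(LITERAL_QUANTIFIERS.fullmatch("a*b+c?") is not None)  # True
    print(LITERAL_QUANTIFIERS.fullmatch("aab") is not None)     # False
    print(FIVE_SPECIALS.fullmatch("[]\\-^") is not None)        # True
    print(RANGE_LIKE.fullmatch("10-20") is not None)            # True
    # Unescaped, '+' and '?' act as quantifiers, so the text does not match itself.
    print(re.fullmatch("1+1=2?", "1+1=2?") is not None)         # False
    print(matches_literally("1+1=2?"))                          # True
```

When a literal string has to be embedded in a pattern, letting a library routine such as `re.escape` decide which characters need a backslash avoids hand-escaping mistakes. The exact set of characters it escapes is specific to Python and can vary between Python versions.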