A regular expression (regex) is a powerful way of specifying a complex search.
MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the extended version.
This is a simplistic reference that skips the details. To get more exact information, see Henry Spencer's regex(7) manual page that is included in the source distribution. See Credits.
A regular expression describes a set of strings. The simplest regexp is one that has no special characters in it. For example, the regexp hello matches hello and nothing else.
Non-trivial regular expressions use certain special constructs so that they can match more than one string. For example, the regexp hello|word matches either the string hello or the string word.
As a more complex example, the regexp B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.
A regular expression may use any of the following special characters/constructs:
| ^ |
Match the beginning of a string.
mysql> SELECT "fo\nfo" REGEXP "^fo$"; -> 0
mysql> SELECT "fofo" REGEXP "^fo"; -> 1
| ||||||||||||||||
| $ |
Match the end of a string.
mysql> SELECT "fo\no" REGEXP "^fo\no$"; -> 1
mysql> SELECT "fo\no" REGEXP "^fo$"; -> 0
| ||||||||||||||||
| . |
Match any character (including newline).
mysql> SELECT "fofo" REGEXP "^f.*"; -> 1
mysql> SELECT "fo\nfo" REGEXP "^f.*"; -> 1
| ||||||||||||||||
| a* |
Match any sequence of zero or more a characters.
mysql> SELECT "Ban" REGEXP "^Ba*n"; -> 1
mysql> SELECT "Baaan" REGEXP "^Ba*n"; -> 1
mysql> SELECT "Bn" REGEXP "^Ba*n"; -> 1
| ||||||||||||||||
| a+ |
Match any sequence of one or more a characters.
mysql> SELECT "Ban" REGEXP "^Ba+n"; -> 1
mysql> SELECT "Bn" REGEXP "^Ba+n"; -> 0
| ||||||||||||||||
| a? |
Match either zero or one a character.
mysql> SELECT "Bn" REGEXP "^Ba?n"; -> 1
mysql> SELECT "Ban" REGEXP "^Ba?n"; -> 1
mysql> SELECT "Baan" REGEXP "^Ba?n"; -> 0
| ||||||||||||||||
| de|abc |
Match either of the sequences de or abc.
mysql> SELECT "pi" REGEXP "pi|apa"; -> 1
mysql> SELECT "axe" REGEXP "pi|apa"; -> 0
mysql> SELECT "apa" REGEXP "pi|apa"; -> 1
mysql> SELECT "apa" REGEXP "^(pi|apa)$"; -> 1
mysql> SELECT "pi" REGEXP "^(pi|apa)$"; -> 1
mysql> SELECT "pix" REGEXP "^(pi|apa)$"; -> 0
| ||||||||||||||||
| (abc)* |
Match zero or more instances of the sequence abc.
mysql> SELECT "pi" REGEXP "^(pi)*$"; -> 1
mysql> SELECT "pip" REGEXP "^(pi)*$"; -> 0
mysql> SELECT "pipi" REGEXP "^(pi)*$"; -> 1
| ||||||||||||||||
| {1} , {2,3} | The is a more general way of writing regexps that match many occurrences of the previous atom. |
| |||||||||||||||
| [a-dX] , [^a-dX] |
Matches
any character which is (or is not, if ^ is used) either a, b,
c, d or X. To include a literal ] character,
it must immediately follow the opening bracket [. To include a
literal - character, it must be written first or last. So
[0-9] matches any decimal digit. Any character that does not have
a defined meaning inside a [] pair has no special meaning and
matches only itself.
mysql> SELECT "aXbc" REGEXP "[a-dXYZ]"; -> 1
mysql> SELECT "aXbc" REGEXP "^[a-dXYZ]$"; -> 0
mysql> SELECT "aXbc" REGEXP "^[a-dXYZ]+$"; -> 1
mysql> SELECT "aXbc" REGEXP "^[^a-dXYZ]+$"; -> 0
mysql> SELECT "gheis" REGEXP "^[^a-dXYZ]+$"; -> 1
mysql> SELECT "gheisa" REGEXP "^[^a-dXYZ]+$"; -> 0
| ||||||||||||||||
| [[.characters.]] | The sequence of characters of that collating element. characters is either a single character or a character name like newline. You can find the full list of character names in regexp/cname.h. | ||||||||||||||||
| [=character_class=] | An equivalence class, standing for the sequences of characters of all collating elements equivalent to that one, including itself. For example, if o and (+) are the members of an equivalence class, then [[=o=]], [[=(+)=]], and [o(+)] are all synonymous. An equivalence class may not be an endpoint of a range. | ||||||||||||||||
| [:character_class:] | Within a bracket expression, the name of a character class enclosed in [: and :] stands for the list of all characters belonging to that class. Standard character class names are: |
| |||||||||||||||
| [[:<:]] , [[:>:]] |
These match the null string at the beginning and end of a word
respectively. A word is defined as a sequence of word characters which
is neither preceded nor followed by word characters. A word character is
an alnum character (as defined by ctype(3)) or an underscore
(_).
mysql> SELECT "a word a" REGEXP "[[:<:]]word[[:>:]]"; -> 1 mysql> SELECT "a xword a" REGEXP "[[:<:]]word[[:>:]]"; -> 0 |
mysql> SELECT "weeknights" REGEXP "^(wee|week)(knights|nights)$"; -> 1