Memory App



The metacharacter | means "or". For example, Bob and Robert are separate expressions, but Bob|Robert is one regex that matches either.
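As a quick illustration, here is a minimal sketch using Python's re module. The choice of Python, the sample sentence, and the helper name find_names are assumptions made for this example.

```python
import re

# One regex with two alternatives: it matches either name.
name_re = re.compile(r"Bob|Robert")

def find_names(text):
    """Return every substring of text matched by either alternative."""
    return name_re.findall(text)

print(find_names("Bob met Robert and Alice."))  # ['Bob', 'Robert']
```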

Character class vs alternation

Looking at the gr[ea]y example, it can also be written grey|gray, or even gr(a|e)y. The latter uses parentheses to constrain the alternation. Note that something like gr[a|e]y is not what we want: within a class, the | is just a normal character, like a and e. With gr(a|e)y, the parentheses are required because without them, gra|ey means "gra or ey". Another example is (First|1st) [Ss]treet. Since both "First" and "1st" end with "st", the combination can be shortened to (Fir|1)st.
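The variants discussed here can be checked side by side. The following is a small sketch in Python (an assumed choice of language). The word lists and the helper full_matches are made up for illustration, and re.fullmatch is used so that a pattern must match the whole candidate string.

```python
import re

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety."""
    return [w for w in candidates if re.fullmatch(pattern, w)]

words = ["grey", "gray", "gr|y", "gra", "ey"]

class_form = full_matches(r"gr[ea]y", words)      # character class
group_form = full_matches(r"gr(a|e)y", words)     # alternation inside parentheses
pipe_in_class = full_matches(r"gr[a|e]y", words)  # | is a literal inside the class
no_parens = full_matches(r"gra|ey", words)        # means "gra" or "ey"

print(class_form)     # ['grey', 'gray']
print(group_form)     # ['grey', 'gray']
print(pipe_in_class)  # ['grey', 'gray', 'gr|y']
print(no_parens)      # ['gra', 'ey']

# The street example: the long and the shortened form accept the same names.
streets = ["First Street", "1st street", "First street", "2nd Street"]
long_form = full_matches(r"(First|1st) [Ss]treet", streets)
short_form = full_matches(r"(Fir|1)st [Ss]treet", streets)
print(long_form == short_form)  # True
```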


Although gr[ea]y matches the same text as gr(a|e)y, be careful not to confuse the concept of alternation with that of a character class. A character class can match just a single character in the target text. With alternation, since each alternative can be a full-fledged regex, each alternative can match an arbitrary amount of text.
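To make the difference concrete, here is a hedged sketch in Python (again an assumed language). The two patterns and the helper is_full_match are invented for this example and are not taken from the text above.

```python
import re

# A character class consumes exactly one character of the target text.
class_re = re.compile(r"[abc]")

# Each alternative is a regex of its own and may match text of any length.
alt_re = re.compile(r"abc|x+|\d{4}-\d{2}")

def is_full_match(regex, text):
    """True if the regex matches the whole of text."""
    return regex.fullmatch(text) is not None

print(is_full_match(class_re, "a"))      # True: one character from the class
print(is_full_match(class_re, "ab"))     # False: a class never spans two characters
print(is_full_match(alt_re, "abc"))      # True: first alternative, three characters
print(is_full_match(alt_re, "xxxxx"))    # True: second alternative, any number of x
print(is_full_match(alt_re, "2024-05"))  # True: third alternative, seven characters
```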


Character classes are almost like their own special mini-language (with their own ideas about metacharacters). Alternation is part of the main regex language.
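The "mini-language" idea can be seen in a short Python sketch (Python and the helper matches are assumptions for this example). Characters such as . * and | lose their usual meaning inside a class, while - and a leading ^ gain a meaning they do not have outside one.

```python
import re

def matches(pattern, text):
    """True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Outside a class, . matches any character; inside, it is a literal dot.
print(matches(r".", "a"))      # True
print(matches(r"[.*|]", "a"))  # False
print(matches(r"[.*|]", "*"))  # True: . * and | are plain characters here

# Inside a class, - between two characters forms a range...
print(matches(r"[a-c]", "b"))  # True
print(matches(r"[a-c]", "-"))  # False

# ...and a leading ^ negates the class instead of anchoring the match.
print(matches(r"[^a]", "b"))   # True
print(matches(r"[^a]", "a"))   # False
```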