Memory App



Parentheses are used to limit the scope of alternation and to group multiple characters so that you can apply quantifiers to them. In many regular-expression flavors, parentheses can also remember the text matched by the subexpression they enclose. Wouldn't it be nice if we could match one generic word, and then say "now match the same thing again"?


Backreferencing is a regular-expression feature that allows you to match new text that is the same as some text matched earlier in the expression.

Example - double double

We start with \b(the) +(the)\b and replace "the" with a regex that matches a general word, say [A-Za-z]+. Finally, we replace the second word with the metasequence \1. The new regex \b([a-zA-Z]+) +\1\b matches any word followed by one or more spaces and the same word again. Of course, you can have more than one set of parentheses. Use \1, \2, etc. to refer to the text matched by the first, second, etc. set. Since egrep considers each line in isolation, it cannot find cases where the last word of one line is repeated at the beginning of the next.
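The doubled-word regex above can be tried out in a few lines of Python. This is only a sketch: it uses Python's re module rather than egrep, and the helper name find_doubled_words is made up for illustration. Like egrep, it looks at one line at a time, so a word repeated across a line break is not found.

```python
import re

# Same pattern as in the text: a word, one or more spaces, then \1,
# which must match exactly the text captured by the first parentheses.
DOUBLED = re.compile(r"\b([a-zA-Z]+) +\1\b")

def find_doubled_words(line):
    """Return each word that is immediately repeated on the given line.

    Hypothetical helper for illustration; matching is case-sensitive.
    """
    return [m.group(1) for m in DOUBLED.finditer(line)]

print(find_doubled_words("this is the the test"))             # ['the']
print(find_doubled_words("it was was a very very long day"))  # ['was', 'very']
print(find_doubled_words("the theory"))                       # []
```

The last call shows why the trailing \b matters: without it, "the the" would be found inside "the theory".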

The great escape

How do you actually match a character that a regex would normally interpret as a metacharacter? You escape it with a backslash. The metasequence to match a literal dot is a dot preceded by a backslash (\.). Another example: the regex \([a-zA-Z]+\) matches a word in literal parentheses, such as (very).
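A small Python sketch of the same idea, again using the re module as a stand-in for whatever regex tool you use. The helper first_match is a made-up name for illustration.

```python
import re

LITERAL_DOT = re.compile(r"\.")           # escaped: matches only a real dot
ANY_CHAR = re.compile(r".")               # unescaped: matches any character except newline
IN_PARENS = re.compile(r"\([a-zA-Z]+\)")  # escaped parentheses match themselves

def first_match(pattern, text):
    """Return the first text matched by pattern, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(first_match(LITERAL_DOT, "abc"))                # None
print(first_match(ANY_CHAR, "abc"))                   # a
print(first_match(LITERAL_DOT, "a.b"))                # .
print(first_match(IN_PARENS, "this is (very) nice"))  # (very)
```

Without the backslashes, the parentheses in the last pattern would group and capture instead of matching literal ( and ) characters.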