


Lookaround






Lookahead

One type of lookaround, called lookahead, peeks forward in the text (toward the right) to see whether its subexpression can match. Positive lookahead is written with the special sequence (?= ), as in (?=\d), which succeeds at any position where the next character is a digit.
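As a minimal sketch of this, the Perl snippet below lists every position in a sample string where (?=\d) succeeds. The sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "abc123";   # illustrative sample, not from the original notes

# (?=\d) matches a position, not a character: it succeeds wherever
# the next character is a digit.
my @positions;
while ($text =~ /(?=\d)/g) {
    push @positions, pos($text);
}

print "@positions\n";   # prints: 3 4 5
```

Each reported position sits just before a digit, and no digit is ever part of the match itself.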

Lookbehind

Another type of lookaround is lookbehind, which looks back (toward the left). It is written with the special sequence (?<= ), as in (?<=\d), which succeeds at any position with a digit immediately to its left.
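The same kind of sketch works for lookbehind. The snippet below, again using an invented sample string, collects the positions where (?<=\d) succeeds.

```perl
use strict;
use warnings;

my $text = "a1b22c";   # illustrative sample, not from the original notes

# (?<=\d) succeeds at positions whose previous character is a digit.
my @positions;
while ($text =~ /(?<=\d)/g) {
    push @positions, pos($text);
}

print "@positions\n";   # prints: 2 4 5
```

Position 2 follows the "1", and positions 4 and 5 follow the two "2" characters.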

Position

An important thing to understand is that lookaround constructs don't actually "consume" any text. The regex /Jeffrey/ matches the text Jeffrey in "Jeffrey Friedl", but the same regex within a lookahead, (?=Jeffrey), matches only the location (or position) just before Jeffrey.
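One way to see the difference is to compare where each match starts and how long it is. This sketch uses Perl's built-in @- and @+ arrays (start and end offsets of the last successful match); the sample string and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "by Jeffrey Friedl";

# Plain regex: the match covers the seven characters of "Jeffrey".
$text =~ /Jeffrey/ or die "no match";
my ($plain_start, $plain_len) = ($-[0], $+[0] - $-[0]);

# Same regex inside a lookahead: the match is an empty string at a position.
$text =~ /(?=Jeffrey)/ or die "no match";
my ($look_start, $look_len) = ($-[0], $+[0] - $-[0]);

print "plain: start $plain_start, length $plain_len\n";     # plain: start 3, length 7
print "lookahead: start $look_start, length $look_len\n";   # lookahead: start 3, length 0
```

Both matches begin at the same place, but only the plain regex consumes any characters.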

Order

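The order in which lookaround and ordinary subexpressions are combined also matters. (?=Jeffrey)Jeff matches the "Jeff" of "by Jeffrey Friedl", because the lookahead consumes nothing and "Jeff" is then matched from the same position. Jeff(?=Jeffrey) doesn't match "by Jeffrey Friedl": it matches "Jeff" only if it is followed immediately by "Jeffrey", as in "JeffJeffrey".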
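A short sketch of the effect of ordering, with the lookahead placed before and after the literal "Jeff". The "JeffJeffrey" string is an invented input that shows the one case where the second form can succeed.

```perl
use strict;
use warnings;

my $text = "by Jeffrey Friedl";

# Lookahead first: confirm "Jeffrey" starts here, then consume "Jeff"
# from that same position.
my $lookahead_first = ($text =~ /(?=Jeffrey)Jeff/) ? 1 : 0;

# Lookahead last: consume "Jeff", then require "Jeffrey" to start
# immediately after it. In this text "rey Friedl" follows instead.
my $lookahead_last = ($text =~ /Jeff(?=Jeffrey)/) ? 1 : 0;

# The second form only succeeds on input such as "JeffJeffrey".
my $doubled = ("JeffJeffrey" =~ /Jeff(?=Jeffrey)/) ? 1 : 0;

print "$lookahead_first $lookahead_last $doubled\n";   # prints: 1 0 1
```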

Open parenthesis

There are a number of special open-parenthesis sequences, but they all begin with the two-character sequence (?. We've already seen group-but-don't-capture (?: ), lookahead (?= ), and lookbehind (?<= ).

Four types of lookaround

Positive Lookahead (?= ): successful if the subexpression can MATCH to the RIGHT
Positive Lookbehind (?<= ): successful if the subexpression can MATCH to the LEFT
Negative Lookahead (?! ): successful if the subexpression can NOT match to the RIGHT
Negative Lookbehind (?<! ): successful if the subexpression can NOT match to the LEFT
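To compare the four types side by side, the sketch below applies each one, with \d as the subexpression, to the same short string. The positions() helper and the sample string are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns, as a space-separated string, the
# positions at which a zero-width regex succeeds in $text.
sub positions {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, pos($text);
    }
    return "@found";
}

# Positions: 0 before "a", 1 before "1", 2 before "b", 3 at the end.
my $text = "a1b";

my $pos_ahead  = positions(qr/(?=\d)/,  $text);   # digit to the right
my $pos_behind = positions(qr/(?<=\d)/, $text);   # digit to the left
my $neg_ahead  = positions(qr/(?!\d)/,  $text);   # no digit to the right
my $neg_behind = positions(qr/(?<!\d)/, $text);   # no digit to the left

print "(?=\\d):  $pos_ahead\n";    # (?=\d):  1
print "(?<=\\d): $pos_behind\n";   # (?<=\d): 2
print "(?!\\d):  $neg_ahead\n";    # (?!\d):  0 2 3
print "(?<!\\d): $neg_behind\n";   # (?<!\d): 0 1 3
```

Note that the negative forms also succeed at the start and end of the string, where there is no character at all on the side being checked.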

Common mistake

You might think that \D (something that is not a digit) is the same as (?!\d). It isn't: \D requires a character to be present and consumes it, while (?!\d) requires no character at all, so it also succeeds at the end of the text.
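The difference shows up in two places: at the end of the text, and in what ends up inside the match. A minimal sketch, with sample strings and variable names invented for illustration:

```perl
use strict;
use warnings;

my $at_end = "item 42";   # nothing follows the number
my $inside = "42x";       # a non-digit follows the number

# At the end of the text, \D fails because there is no character to match,
# while (?!\d) succeeds because no digit follows.
my $d_at_end    = ($at_end =~ /42\D/)     ? 1 : 0;
my $look_at_end = ($at_end =~ /42(?!\d)/) ? 1 : 0;

# When a non-digit does follow, \D consumes it and the lookahead does not.
my ($d_text)    = $inside =~ /(42\D)/;
my ($look_text) = $inside =~ /(42(?!\d))/;

print "$d_at_end $look_at_end\n";   # prints: 0 1
print "$d_text $look_text\n";       # prints: 42x 42
```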

Lookaround Examples

1) Match "Jeff" only if it is part of "Jeffrey"

    $var = "Jeffrey Friedl";
    $var =~ s/(?=Jeffrey)(Jeff)/by $1/;
    print $var;          # Outputs: by Jeffrey Friedl
    # "Jeff Friedl" doesn't match, so it would be left unchanged

regexpal.com/?fam=111484

2) Replace "Jeffs" with "Jeff's"

    # with lookahead
    $var = "Jeffs articles";
    $var =~ s/\bJeff(?=s\b)/Jeff'/g;
    print $var;          # Outputs: Jeff's articles

    # with lookbehind and lookahead: only the position between
    # "Jeff" and "s" is matched, and the apostrophe is inserted there
    $var = "Jeffs articles";
    $var =~ s/(?<=\bJeff)(?=s\b)/'/g;
    print $var;          # Outputs: Jeff's articles

regexpal.com/?fam=111485

3) Commafying numbers

    # First attempt: any position with at least three digits to the
    # right qualifies, so far too many commas are inserted
    $var = "The population of 2298444215 is growing";
    $var =~ s/(?<=\d)(?=(\d\d\d)+)/,/g;
    print $var . "\n";   # Outputs: The population of 2,2,9,8,4,4,4,215 is growing

    # If we add \b it works: the groups of three must run up to the end of the number
    $var = "The population of 2298444215 is growing";
    $var =~ s/(?<=\d)(?=(\d\d\d)+\b)/,/g;
    print $var . "\n";   # Outputs: The population of 2,298,444,215 is growing

    # But it doesn't match something like this, because there is
    # no word boundary between "5" and "H"
    $var = "12345Hz";
    $var =~ s/(?<=\d)(?=(\d\d\d)+\b)/,/g;
    print $var . "\n";   # Outputs: 12345Hz

    # We use (?!\d) as the boundary after the groups of three digits
    $var = "12345Hz";
    $var =~ s/(?<=\d)(?=(\d\d\d)+(?!\d))/,/g;
    print $var . "\n";   # Outputs: 12,345Hz