MemoryRefresh!

Perl / Lookaround





Lookaround metasequences
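Lookaround is a much more general construct than the special-case word boundary (\b) and the anchors (^ and $). Like them, it matches a position in the text rather than the text itself.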
Lookahead
One type of lookaround, called lookahead, peeks forward in the text (toward the right) to see if its subexpression can match. Positive lookahead is written with the special sequence (?= ), as in (?=\d), which succeeds at positions where a digit comes next.
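As a small illustration of the lookahead described above, the following sketch lists every position in a string at which a digit comes next. The sample string and variable names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $text = "ab12c3";

# Collect every position at which a digit comes next.
# The lookahead matches an empty string, so pos() reports the position itself.
my @positions;
while ($text =~ /(?=\d)/g) {
    push @positions, pos($text);
}

print "@positions\n";   # prints: 2 3 5
```

Positions are counted from 0, so 2 is the gap between "b" and "1".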
Lookbehind
Another type of lookaround is lookbehind, which looks back (toward the left). It is written with the special sequence (?<= ), as in (?<=\d), which succeeds at positions with a digit immediately to the left.
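A minimal sketch of lookbehind in action: it marks every position that has a digit immediately to its left by inserting a "|" there. The sample string and the marker character are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $text = "ab12c3";

# Insert a "|" marker at every position that has a digit immediately to its left.
# The lookbehind matches an empty string, so no characters are replaced.
(my $marked = $text) =~ s/(?<=\d)/|/g;

print "$marked\n";   # prints: ab1|2|c3|
```

The last marker lands at the very end of the string, because the final position also has a digit to its left.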
Position
An important thing to understand is that lookaround constructs don't actually "consume" any text. The regex /Jeffrey/ matches the text Jeffrey in "Jeffrey Friedl", but the same regex within a lookahead, (?=Jeffrey), matches only the location (or position) just before Jeffrey.
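One way to see the difference is to compare where each match starts and ends. The sketch below uses Perl's built-in @- and @+ arrays (match start and end offsets); the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "Jeffrey Friedl";

# Plain regex: consumes the seven characters of "Jeffrey".
$text =~ /Jeffrey/ or die "no match";
my ($plain_start, $plain_end) = ($-[0], $+[0]);

# Same regex inside a lookahead: matches an empty string at position 0.
$text =~ /(?=Jeffrey)/ or die "no match";
my ($look_start, $look_end) = ($-[0], $+[0]);

print "plain:     $plain_start..$plain_end\n";   # prints: plain:     0..7
print "lookahead: $look_start..$look_end\n";     # prints: lookahead: 0..0
```

The lookahead match starts and ends at the same offset, which is what "matches only a position" means in practice.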
Order
The order in which a lookaround is combined with the rest of the regex matters. (?=Jeffrey)Jeff matches the "Jeff" of "Jeffrey" in "by Jeffrey Friedl", but Jeff(?=Jeffrey) does not match there: it matches "Jeff" only if it is followed immediately by "Jeffrey", as in "JeffJeffrey".
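The effect of ordering can be checked directly. This sketch tries both orderings against the same text, plus one made-up string ("JeffJeffrey") that the second ordering does accept.

```perl
use strict;
use warnings;

my $text = "by Jeffrey Friedl";

# Lookahead first: confirm "Jeffrey" starts here, then consume "Jeff".
my $lookahead_first = ($text =~ /(?=Jeffrey)Jeff/) ? 1 : 0;

# "Jeff" first: the lookahead is now tested after "Jeff", where "rey" comes next.
my $jeff_first = ($text =~ /Jeff(?=Jeffrey)/) ? 1 : 0;

# Hypothetical string in which "Jeff" really is followed by "Jeffrey".
my $doubled = ("JeffJeffrey" =~ /Jeff(?=Jeffrey)/) ? 1 : 0;

print "lookahead first: $lookahead_first\n";   # prints: lookahead first: 1
print "Jeff first:      $jeff_first\n";        # prints: Jeff first:      0
print "JeffJeffrey:     $doubled\n";           # prints: JeffJeffrey:     1
```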
Open parenthesis
There are a number of special "open parenthesis" sequences, and they all begin with the two-character sequence "(?". We've already seen group-but-don't-capture "(?: )", lookahead "(?= )", and lookbehind "(?<= )".
Four types of lookaround
Positive lookahead    (?= )     successful if the subexpression can MATCH to the RIGHT
Positive lookbehind   (?<= )    successful if the subexpression can MATCH to the LEFT
Negative lookahead    (?! )     successful if the subexpression can NOT match to the RIGHT
Negative lookbehind   (?<! )    successful if the subexpression can NOT match to the LEFT
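The four types can be compared side by side. The sketch below runs each one against short made-up strings ("foobar", "foobaz", "bazbar") and records whether it matches; the strings and labels are assumptions for illustration.

```perl
use strict;
use warnings;

# Each case: [ label, regex, subject string ].
my @cases = (
    [ 'positive lookahead',  qr/foo(?=bar)/,  'foobar' ],
    [ 'negative lookahead',  qr/foo(?!bar)/,  'foobar' ],
    [ 'negative lookahead',  qr/foo(?!bar)/,  'foobaz' ],
    [ 'positive lookbehind', qr/(?<=foo)bar/, 'foobar' ],
    [ 'negative lookbehind', qr/(?<!foo)bar/, 'foobar' ],
    [ 'negative lookbehind', qr/(?<!foo)bar/, 'bazbar' ],
);

my @results;
for my $case (@cases) {
    my ($label, $re, $subject) = @$case;
    my $outcome = ($subject =~ $re) ? 'match' : 'no match';
    push @results, $outcome;
    printf "%-20s on %-7s %s\n", $label, $subject, $outcome;
}
```

The positive forms match "foobar"; the negative forms fail on "foobar" and succeed only on the strings where the forbidden text is absent.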
Common mistake
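You might think that \D (something that is not a digit) is the same as (?!\d). It is not: \D requires a non-digit character to be present and consumes it, while (?!\d) only requires that no digit comes next, so it also succeeds at the end of the string and consumes nothing.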
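The difference shows up at the end of a string and in what gets matched. A short sketch, using made-up strings:

```perl
use strict;
use warnings;

# At the end of the string nothing follows "42".
my $text = "room 42";
my $class_at_end = ($text =~ /42\D/)     ? 1 : 0;   # \D needs a character: fails
my $look_at_end  = ($text =~ /42(?!\d)/) ? 1 : 0;   # (?!\d) needs no character: succeeds

# When a non-digit does follow, \D consumes it and (?!\d) does not.
my ($class_match) = "42x" =~ /(42\D)/;
my ($look_match)  = "42x" =~ /(42(?!\d))/;

print "\\D at end:     $class_at_end\n";   # prints: \D at end:     0
print "(?!\\d) at end: $look_at_end\n";    # prints: (?!\d) at end: 1
print "\\D matched:     $class_match\n";   # prints: \D matched:     42x
print "(?!\\d) matched: $look_match\n";    # prints: (?!\d) matched: 42
```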
Lookaround Examples
1) Match "Jeff" only if it is part of "Jeffrey"
$var = "Jeffrey Friedl";
$var =~ s/(?=Jeffrey)(Jeff)/by $1/;
print $var; # Outputs: by Jeffrey Friedl
$var = "Thomas Jefferson";
$var =~ s/(?=Jeffrey)(Jeff)/by $1/;
print $var; # No match; outputs: Thomas Jefferson
2) Replace "Jeffs" with "Jeff's" (with lookahead)
$var = "Jeffs articles";
$var =~ s/\bJeff(?=s\b)/Jeff'/g;
print $var; # Outputs: Jeff's articles
3) Replace "Jeffs" with "Jeff's" (with lookbehind and lookahead)
$var = "Jeffs articles";
$var =~ s/(?<=\bJeff)(?=s\b)/'/g;
print $var; # Outputs: Jeff's articles
4) Commafying numbers (first attempt: nothing marks where the groups of three digits must end)
$var = "The population of 2298444215 is growing";
$var =~ s/(?<=\d)(?=(\d\d\d)+)/,/g;
print $var . "\n"; # Outputs: The population of 2,2,9,8,4,4,4,215 is growing
4.1) If we add \b it works
$var = "The population of 2298444215 is growing";
$var =~ s/(?<=\d)(?=(\d\d\d)+\b)/,/g;
print $var . "\n"; # Outputs: The population of 2,298,444,215 is growing
4.2) But it doesn't match something like this, because there is no word boundary between "5" and "H":
$var = "12345Hz";
$var =~ s/(?<=\d)(?=(\d\d\d)+\b)/,/g;
print $var . "\n"; # Outputs: 12345Hz
4.3) We use (?!\d) as the boundary after the groups of three digits
$var = "12345Hz";
$var =~ s/(?<=\d)(?=(\d\d\d)+(?!\d))/,/g;
print $var . "\n"; # Outputs: 12,345Hz
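The final regex from example 4.3 could be wrapped in a small reusable subroutine. This is only a sketch: the name commify is a choice made here, and the inner group is written as non-capturing (?: ) because the captured digits are never used.

```perl
use strict;
use warnings;

# Insert commas into every run of digits in $text, grouping from the right.
sub commify {
    my ($text) = @_;
    # After a digit, and before one or more groups of exactly three digits
    # that are not followed by yet another digit.
    $text =~ s/(?<=\d)(?=(?:\d\d\d)+(?!\d))/,/g;
    return $text;
}

print commify("12345Hz"), "\n";   # prints: 12,345Hz
print commify("The population of 2298444215 is growing"), "\n";
# prints: The population of 2,298,444,215 is growing
```

Because the subroutine works on a copy of its argument, the caller's string is left untouched.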

