Expert Refresh


1) Regex to match <H1>, <H2 >

2) Regex to match <HR> or <HR SIZE=1> (size optional)

3) Regex to match US stock ticker (one to five letters)

Plus (+) and star (*)
Similar to ? (question mark) are + (plus) and * (star). The metacharacter + means "one or more of the immediately preceding item". The metacharacter * means "any number, including none, of the immediately preceding item". Unlike ?, neither has an upper limit on how many repetitions it will match.
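A minimal sketch of the difference, using Python's re module (the pattern ab and the sample strings are made up for illustration). fullmatch is used here so that the whole string has to fit the pattern:

```python
import re

# "+" needs at least one "b"; "*" is also happy with none.
one_or_more = re.compile(r"ab+")
zero_or_more = re.compile(r"ab*")

print(bool(one_or_more.fullmatch("a")))      # False: no "b" at all
print(bool(zero_or_more.fullmatch("a")))     # True: zero "b"s is allowed
print(bool(one_or_more.fullmatch("abbb")))   # True
print(bool(zero_or_more.fullmatch("abbb")))  # True
```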
Example <H1>
The HTML specification says that spaces are allowed before the closing >. Regex /<H[1-6] *>/ matches <H1>, but also <H1 > or <H1   > (with any number of spaces).
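The heading regex can be tried out with a short Python sketch. The sample tags are invented for illustration, and fullmatch is an assumption made here so that each tag must match in its entirety:

```python
import re

# Any number of spaces (including none) may precede the closing ">".
heading = re.compile(r"<H[1-6] *>")

samples = ["<H1>", "<H2 >", "<H3    >", "<H7>", "<H 1>"]
heading_results = [bool(heading.fullmatch(tag)) for tag in samples]
print(heading_results)  # [True, True, True, False, False]
```

<H7> fails because 7 is outside [1-6], and <H 1> fails because the space is only allowed after the digit, not before it.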
Example <HR SIZE=14>
Regex /<HR +SIZE *= *[0-9]+ *>/ matches <HR SIZE=1>, but also <HR SIZE = 14 > (with any number of spaces). Our eyes have always been trained to treat spaces specially. That's a habit you will have to break when reading regex. The space character is a normal character, no different from, say, j or 4.
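A small Python sketch shows where the spaces are required and where they are optional. The sample tags are illustrative, and fullmatch is used so the whole tag must fit:

```python
import re

# " +" demands at least one space after HR; every " *" allows none.
hr_size = re.compile(r"<HR +SIZE *= *[0-9]+ *>")

samples = ["<HR SIZE=1>", "<HR   SIZE = 14 >", "<HRSIZE=1>", "<HR SIZE=>", "<HR>"]
hr_size_results = [bool(hr_size.fullmatch(tag)) for tag in samples]
print(hr_size_results)  # [True, True, False, False, False]
```

<HRSIZE=1> fails on the missing space, <HR SIZE=> fails because [0-9]+ needs at least one digit, and <HR> fails because SIZE is not yet optional in this regex.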
Example (SIZE optional)
If we want SIZE to be optional, the regex will be /<HR( +SIZE *= *[0-9]+)? *>/. Note that the ending " *" is kept outside the group parentheses. This still allows something such as <HR >. Note also that the first " +" is included within the parentheses. Otherwise, a space would have been required after the HR, and <HR> would not match.
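To see why the placement of " +" matters, here is a hedged Python comparison of the regex above with a variant that moves " +" outside the group. The variant and its name are hypothetical, written only to show the failure:

```python
import re

# " +" inside the optional group: the space is needed only when SIZE is present.
hr_optional = re.compile(r"<HR( +SIZE *= *[0-9]+)? *>")
# Hypothetical variant with " +" outside the group: a space is always needed.
hr_space_outside = re.compile(r"<HR +(SIZE *= *[0-9]+)? *>")

samples = ["<HR>", "<HR >", "<HR SIZE=1>", "<HR SIZE = 14 >", "<HRSIZE=1>"]
inside_results = [bool(hr_optional.fullmatch(tag)) for tag in samples]
outside_results = [bool(hr_space_outside.fullmatch(tag)) for tag in samples]
print(inside_results)   # [True, True, True, True, False]
print(outside_results)  # [False, True, True, True, False]
```

The only difference is the first sample: with " +" outside the group, plain <HR> no longer matches.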
Interval quantifier {min,max}
You might use [a-zA-Z]{1,5} to match a US stock ticker (from one to five letters). The notation {0,1} is the same as ? (question mark). If only a single number is given (such as [a-z]{3}), it matches exactly that many of the item.
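The three forms of the interval quantifier can be checked with a short Python sketch. The ticker strings and the colou{0,1}r pattern are examples chosen here, not taken from the text, and fullmatch is used so the whole string must fit:

```python
import re

ticker = re.compile(r"[a-zA-Z]{1,5}")   # one to five letters
exactly_three = re.compile(r"[a-z]{3}")  # exactly three lowercase letters

samples = ["F", "MSFT", "GOOGL", "TOOLONG", "", "BRK.B"]
ticker_results = [bool(ticker.fullmatch(s)) for s in samples]
print(ticker_results)  # [True, True, True, False, False, False]

# {0,1} behaves like "?": the "u" may appear once or not at all.
print(bool(re.fullmatch(r"colou{0,1}r", "color")))  # True
print(bool(re.fullmatch(r"colou?r", "colour")))     # True

three_results = [bool(exactly_three.fullmatch(s)) for s in ["ab", "abc", "abcd"]]
print(three_results)  # [False, True, False]
```

Note that a letters-only pattern rejects symbols written with a dot, such as BRK.B; whether that matters depends on the data you are matching.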