This example demonstrates how you can add capturing groups to a regular expression to figure out which part of the regular expression found the match. You can find this example as “Capturing groups” in the RegexMagic library.
For this example, we’ll continue with the regular expression created in the example about matching unrelated items using alternation. That regular expression matches a number or an email address. Using this regex with a “find all” command, we can get a list of all numbers and email addresses.
Now we want to use this regex to iterate over all the numbers and email addresses in a file and keep the two kinds of match apart, without running a second regular expression to check whether each match is a number or an email address. We can achieve this by placing one capturing group around the part of the regex that matches the number and another around the part that matches the email address. Because the two parts are alternatives, each match is found by exactly one of them, so only one of the capturing groups captures any text in a given match. If the group for the number captured text, we know we have a number. If not, the group for the email address will have captured the email address.
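As a rough illustration of this idea, the sketch below iterates over all matches and sorts them by which group captured text. It assumes Python's `re` module, which is not part of this example: Python writes named groups as `(?P<name>...)` rather than `(?<name>...)`, and the "Case insensitive" option is assumed to correspond to `re.IGNORECASE`. The function name `split_matches` and the sample text are made up for illustration.

```python
import re

# The non-free-spacing regex from this example, with the named groups
# rewritten in Python's (?P<name>...) syntax (an adaptation, not the
# syntax RegexMagic generated here).
PATTERN = re.compile(
    r"(?P<number>[0-9]+)"
    r"|(?P<email>[\d!#$%&'*+./=?_`a-z{|}~^-]+@[\d.a-z-]+\.[a-z]{2,63})",
    re.IGNORECASE,  # assumed equivalent of the "Case insensitive" option
)


def split_matches(text):
    """Return (numbers, emails) found in text, using one regex pass."""
    numbers, emails = [], []
    for match in PATTERN.finditer(text):
        # A group that did not take part in the match returns None.
        if match.group("number") is not None:
            numbers.append(match.group("number"))
        else:
            emails.append(match.group("email"))
    return numbers, emails


if __name__ == "__main__":
    print(split_matches("Order 42 from joe@example.com and 7"))
```

Checking the group against `None` rather than testing its truthiness keeps the logic valid even for groups that could legitimately capture an empty string in other regexes.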
You could actually achieve this with just one capturing group for the number. When the group for the number doesn’t capture anything, retrieving the overall regex match gives the email address. But in this example we’ll create two groups just for practice.
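The single-group variant could look roughly like the following sketch. It again assumes Python's `re` module. The regex is a hand-edited version of the one in this example with the email group removed, so it is not something RegexMagic generated. The name `classify` is hypothetical.

```python
import re

# Hand-edited variant: only the number alternative has a capturing group.
SINGLE_GROUP = re.compile(
    r"([0-9]+)|[\d!#$%&'*+./=?_`a-z{|}~^-]+@[\d.a-z-]+\.[a-z]{2,63}",
    re.IGNORECASE,
)


def classify(text):
    """Return a list of (kind, matched_text) pairs for every match."""
    results = []
    for match in SINGLE_GROUP.finditer(text):
        # If group 1 did not participate, the overall match is the email.
        kind = "number" if match.group(1) is not None else "email"
        results.append((kind, match.group(0)))
    return results
```

Here `match.group(0)` is the overall regex match, so no second group is needed to retrieve the email address.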
    # 1. One of the fields 2 to 3
      # 2. number: Integer
      (?<number>[0-9]+)
     |
      # 3. email: Email address
      (?<email>[!#$%&'*+./0-9=?_`a-z{|}~^-]++@[.0-9a-z-]+\.[a-z]{2,63}+)
Required options: Case insensitive; Free-spacing.
Unused options: Dot doesn’t match line breaks; ^$ don’t match at line breaks; Greedy quantifiers.
(?<number>[0-9]+)|(?<email>[\d!#$%&'*+./=?_`a-z{|}~^-]+@[\d.a-z-]+\.[a-z]{2,63})
Required options: Case insensitive.
Unused options: Dot doesn’t match line breaks; ^$ don’t match at line breaks.