Regular Expressions in a Shortcut

Regular expressions (regexes) are supported in shortcuts and are useful for manipulating text strings. I thought I would briefly explain how they work and provide a few examples.

By way of preface, the regexes used in shortcuts follow the ICU standard, which can be found at:

Two actions are used in a shortcut to implement a regex. The first is the Match Text action, which returns a list of strings that match a regex pattern. In the following example, the regex pattern finds 3 digits, followed by a literal dash, followed by 4 digits. The shortcut returns a list of 2 matches.
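As a rough stand-in for the Match Text action, here is a minimal Python sketch of the same pattern. The sample text is hypothetical, and Python's re module is not the ICU engine that shortcuts use, although this particular pattern should behave the same way in both.

```python
import re

# Hypothetical sample text, made up for illustration.
text = "Call 555-1234 or 555-9876 before noon."

# Three digits, a literal dash, then four digits.
pattern = r"\d{3}-\d{4}"

# findall returns every non-overlapping match as a list of strings,
# much like the list that Match Text produces.
matches = re.findall(pattern, text)
print(matches)  # ['555-1234', '555-9876']
```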

The second regex action is Replace Text, which does a search and replace. In the following example, substrings that match the regex pattern are replaced with the text [redacted].
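A comparable search and replace can be sketched in Python with re.sub. The sample text is again hypothetical, and re.sub is only an approximation of the Replace Text action, not the action itself.

```python
import re

# Hypothetical sample text, made up for illustration.
text = "Call 555-1234 or 555-9876 before noon."

# Every substring that matches the pattern is replaced with [redacted].
redacted = re.sub(r"\d{3}-\d{4}", "[redacted]", text)
print(redacted)  # Call [redacted] or [redacted] before noon.
```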

Capture groups are a useful regex feature: they capture the portion of the text matched by the part of a regex pattern contained within parentheses. In a Replace Text action, the text matched by capture groups is available in the replacement text as $1, $2, and so on. In a Match Text action, the text matched by a capture group is returned by a Get Group from Matched Text action.

This example uses a capture group in a Replace Text action to return a substring contained within quotation marks. It’s important to note that the pattern matches the entire string, so the entire string is replaced with the hello John greeting. This technique may seem to be of little use, but I use it frequently.

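In the above example, hello John is returned instead of hello Jane because the .* metacharacters are made lazy by appending the ? metacharacter. A lazy quantifier matches as few characters as possible, while a greedy quantifier (the default) matches as many as possible. The difference between lazy and greedy matching is an important concept in regexes, and a Google search will yield many good explanations.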
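The lazy versus greedy difference can be illustrated with a small Python sketch. The sample text and both patterns are assumptions made for illustration, not the exact ones from the original example. Also note that Python writes the capture group reference as \1, where a Replace Text action uses $1.

```python
import re

# Hypothetical sample text with two quoted names.
text = 'The names are "John" and "Jane".'

# Lazy leading .*? stops at the first quotation mark, so group 1 is John.
# The pattern matches the whole string, so the whole string is replaced.
lazy = re.sub(r'^.*?"(.*?)".*$', r"hello \1", text)
print(lazy)  # hello John

# Greedy leading .* runs as far as it can and backtracks only as needed,
# so group 1 ends up being the last quoted substring.
greedy = re.sub(r'^.*"(.*?)".*$', r"hello \1", text)
print(greedy)  # hello Jane
```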

This example uses a capture group in a Match Text action to return the substrings contained within quotation marks. A list of two items results.
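A Python approximation of Match Text followed by Get Group from Matched Text might look like the sketch below. The sample text is hypothetical, and the pairing of finditer with group(1) is only an analogy for the two shortcut actions.

```python
import re

# Hypothetical sample text with two quoted names.
text = 'The names are "John" and "Jane".'

# The parentheses capture the text between each pair of quotation marks.
pattern = r'"(.*?)"'

# group(0) is the full match (quotation marks included);
# group(1) is only the text captured by the first group.
full_matches = [m.group(0) for m in re.finditer(pattern, text)]
groups = [m.group(1) for m in re.finditer(pattern, text)]

print(full_matches)  # ['"John"', '"Jane"']
print(groups)        # ['John', 'Jane']
```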

A few additional thoughts:

  • Replace Text actions are almost always faster than Match Text actions, although both are very quick.

  • It is not always possible to accomplish a particular task with one regex action, and two or more consecutive regex actions can be used instead.

  • Literal characters can be included in a regex pattern, but the following characters must be escaped with a backslash to retain their literal meaning: * ? + [ ( ) { } ^ $ | \ .

  • If a regex pattern contains an error, the shortcut will report an error. If a regex pattern does not find a match, a Replace Text action returns the entire original string unchanged, while a Match Text action returns no matches.
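The points about escaping, missing matches, and pattern errors can be sketched in Python as follows. The sample strings are hypothetical, and Python's behavior is shown only as an analogy: how a shortcut reports an error or a missing match depends on the action itself.

```python
import re

# Hypothetical sample text containing characters that are special in a regex.
text = "Total: $4.50 (approx.)"

# Escaped with backslashes, $ and . are treated as literal characters.
literal = re.findall(r"\$4\.50", text)
print(literal)  # ['$4.50']

# re.escape is a Python convenience that adds the backslashes for you.
escaped_pattern = re.escape("$4.50")
via_escape = re.findall(escaped_pattern, text)

# With no match, a replace leaves the string unchanged
# and a match returns an empty list.
no_digits = "no phone numbers here"
unchanged = re.sub(r"\d{3}-\d{4}", "[redacted]", no_digits)
empty = re.findall(r"\d{3}-\d{4}", no_digits)

# A malformed pattern (here, an unclosed parenthesis) raises an error
# instead of silently matching nothing.
try:
    re.compile(r"(\d{3}")
    pattern_ok = True
except re.error:
    pattern_ok = False
```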