More regex questions?

These questions are still within the framework of Apple Shortcuts

  1. UK Road number designations (as far as I’m aware) are a single letter, A, B, C, D, M followed by a 1, 2, 3 digit number. I want the regex to recognise that pattern so I can do a replace (actually a remove) on it.

  2. I have a UK country field that on occasion gets populated with a UK post code. This is a data error. To get round the issue I want to ‘null’ the field. The easiest way is to say if there is a numeric from 0 to 9 in that field take action and set it to null.
    The problem is that in an Apple Shortcut, I can only do a ‘contains’ operation. The way round it would be to replace any number in the field with a ‘$$$.’ marker, then use ‘contains’ to check whether the marker exists in the field; if it does, set the field to null. So here I’m looking for the regex to detect a numeric character and replace it with the marker.
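The marker approach in question 2 can be sketched outside Shortcuts. This is a minimal Python illustration, not the Shortcuts action itself; the function names `mark_digits` and `clean_country` are made up for the example. In some regex engines `$` has a special meaning in the replacement text, so a marker without `$` may be the safer choice in Shortcuts (an assumption worth testing there).

```python
import re

MARKER = "$$$."  # marker string taken from the question


def mark_digits(value: str) -> str:
    """Replace every digit 0-9 with the marker."""
    # A function is used as the replacement so the marker is inserted literally.
    return re.sub(r"[0-9]", lambda _match: MARKER, value)


def clean_country(value: str) -> str:
    """Return an empty string if the field contained any digit, else the field unchanged."""
    marked = mark_digits(value)
    # This is the 'contains' check that Shortcuts can perform.
    return "" if MARKER in marked else value


print(repr(clean_country("United Kingdom")))
print(repr(clean_country("WA14 4SJ")))
```

The regex needed for the replace step is just `[0-9]` (or `\d`).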

Can anyone assist? I don’t want to bother @peavine with these questions as he must be tired of me asking such questions :slight_smile:

I wish I could get my head around regex but it is way beyond me.

The UK road number designation can be checked with this regex

[ABCDM][1-9]

it’s simply one letter out of ABCDM followed by one digit out of 1-9

And the UK post code with

[A-Z]{1,2}\d{1,2}([A-Z]{1,2})?\s\d[A-Z]{2}

this is

one or two capital letters
one or two digits
optional one or two capital letters
one whitespace character
one digit
two capital letters

according to https://www.upu.int/UPU/media/upu/PostalEntitiesFiles/addressingUnit/gbrEn.pdf
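As a quick check of the post code pattern above, here is a small Python sketch (Python is used only for illustration; `is_full_postcode` is a hypothetical helper). It requires the whole string to be a post code, so an outward code on its own, such as ‘NE66’, is not accepted.

```python
import re

# The post code pattern from the post above, unchanged.
POSTCODE = re.compile(r"[A-Z]{1,2}\d{1,2}([A-Z]{1,2})?\s\d[A-Z]{2}")


def is_full_postcode(text: str) -> bool:
    """True if the whole string is a full post code such as 'WA14 4SJ'."""
    return POSTCODE.fullmatch(text) is not None


for sample in ["WA14 4SJ", "SW1A 1AA", "NE66", "wa14 4sj"]:
    print(sample, is_full_postcode(sample))
```

The pattern is case-sensitive, so lower-case input would need a case-insensitive option or upper-casing first.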

Per the OP, road numbers can have up to 3 digits and can include ‘0’.

However, if you look at a map, you can see roads with 4 digits. None seem to begin with a ‘0’ but many contain one.

  • The UK’s A roads can have 1 - 4 digits in their numbers. Additionally, some have “(M)” suffixes to indicate that motorway rules apply on those sections.
  • B road numbers have 3 or 4 digits.
  • Motorway (“M” prefix) numbers have 1 - 3 digits. Additionally, there’s an alternative route to the M6 called the “M6 Toll”, which may be stylised as “M6toll”.
  • Other prefixes exist locally and for cycleways, according to Wikipedia, but I don’t recall ever having seen any and their use on signs is apparently “not advised” in England.

A regex pattern to match these would depend on how likely the target string is to contain a sequence that could be mistaken for a road number or simply how finicky you want to be. Ignoring the possibility of mistaken identity, this would do the trick:

[ABCDHMUQV]\d{1,4}(?:\(M\)| ?[Tt]oll)?
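To see what the road pattern above picks up, here is a short Python sketch (illustrative only; `remove_roads` is a made-up helper name). It also shows the "mistaken identity" case mentioned above: part of a post code can look like a road number.

```python
import re

# The road-number pattern from the post above, unchanged.
ROAD = re.compile(r"[ABCDHMUQV]\d{1,4}(?:\(M\)| ?[Tt]oll)?")


def remove_roads(text: str) -> str:
    """Delete every road designation the pattern finds."""
    return ROAD.sub("", text)


print(ROAD.findall("A1(M), B1234, M6 Toll, M6toll"))
# Mistaken identity: the 'A14' inside a post code also matches.
print(ROAD.findall("WA14 4SJ"))
print(repr(remove_roads("Turn onto the A345")))
```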

Okay starting to look at Post Codes first. This:

Doesn’t make any changes at all. The post codes of ‘WA14 4SJ’ or ‘NE66’ are left unchanged? Have I misunderstood something?

Did you enable Regular Expression in the action?

Sigh… Thanks. I forget that nearly every flipping time. Argh :slight_smile:

It does now work for post codes like ‘WA14 4SJ’ but not shorter ones like ‘NE54’

However if I use ‘[A-Z]{1,2}\d{1,2}’ that seems to recognise longer and shorter post codes? So would that be good to use?

Sure, it matches any text containing one or two capital letters followed by one or two digits, wherever that sequence appears in the string.
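A small Python sketch of the shorter pattern (for illustration only; `find_code` is a hypothetical helper). Because the pattern is not anchored, it finds the letters-then-digits sequence anywhere in the field, which is enough for a "does this field look like a post code" test.

```python
import re

# The shortened pattern from the question above.
OUTWARD = re.compile(r"[A-Z]{1,2}\d{1,2}")


def find_code(text: str):
    """Return the first letters-plus-digits run found anywhere in the text, or None."""
    match = OUTWARD.search(text)
    return match.group(0) if match else None


for sample in ["WA14 4SJ", "NE54", "United Kingdom", "A345"]:
    print(sample, "->", find_code(sample))
```

Note that a road number such as ‘A345’ also matches (as ‘A34’), so the pattern signals "not a country name" rather than "definitely a post code".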

Okay I’ll put it into operation and do some testing, then street numbers comes next :slight_smile:
Thank you very much.

Thanks guys, the regex / code for replacing road code designations is also in place.

So both systems are in place, as well as a few other gotchas I had to take care of where the data is in the wrong order :slight_smile:

A big thank you to everyone.

Using [ABCDHMUQV]\d{1,4}(?:\(M\)| ?[Tt]oll)? is working well with the roads. Can it be adjusted, though, so that it can remove:

A345 - Roads as it is now
, A345 - Trailing comma and space (from the previous part of the string) then the Road
A345 - Trailing SPACE after the road name

Again I can do these with multiple passes but I thought I’d ask as it would make it easier to work with.:slight_smile:
Thanks as always.

You can add this before the pattern:
\,?\s?

And this after it:
\s?

I’m missing something, as I don’t see a \s? in [ABCDHMUQV]\d{1,4}(?:\(M\)| ?[Tt]oll)? to add it after?

Sorry…

Sorry, I didn’t include your full RegEx. The ? after a match “condition” marks it as optional.

Right, I think… So I should use:

\,?\s?[ABCDHMUQV]\d{1,4}(?:\(M\)| ?[Tt]oll)?

I think I’ve finally got it:


I’ve no idea why they are on different lines though. It seems that a space being removed is causing a line return?

RegEx
I replaced your literal space with \s
Literal spaces are a bad idea

\,?\s?[ABCDHMUQV]\d{1,4}(?:\(M\)|\s?[Tt]oll)?\s?

Sample Text

 , A345 - Road
A345-Road
,A345 - Road
,A345-Road
A345 Toll - Road
, 34 - Road

Capture $0

, A345 

A345
,A345 
,A345

A345 Toll 


You should work on your RegEx further.
Everything in M/Toll stuff is all optional and is always matching. You’ll also notice full empty line space / Line Feed being captured. This is the optional match always matching.

Work on line anchors or further conditionals.
I would capture the stuff you wanna keep in a group and use that in the replacement

Add a line start anchor ^
Before the comma and space
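One likely contributor to the blank lines in the capture list is that \s also matches a line feed, so the optional \s? in front of the letter can swallow the end of the previous line. The Python sketch below reproduces that with a two-line sample; Python stands in for Shortcuts here, and the `[ \t]` variant is only one possible alternative, not something proposed in the thread.

```python
import re

text = ", A345 - Road\nA345-Road"

# Pattern with \s? before and after, as posted above.
with_s = re.compile(r"\,?\s?[ABCDHMUQV]\d{1,4}(?:\(M\)|\s?[Tt]oll)?\s?")

# Variant that only allows a space or tab, so it cannot cross a line break.
with_blank = re.compile(r"\,?[ \t]?[ABCDHMUQV]\d{1,4}(?:\(M\)|[ \t]?[Tt]oll)?[ \t]?")

print(with_s.findall(text))           # second match starts with the line feed
print(repr(with_s.sub("", text)))     # the two lines are joined
print(with_blank.findall(text))
print(repr(with_blank.sub("", text)))  # the line break survives
```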


I’m sorry to say but that is completely beyond me :frowning:

I re-read what you suggested and made the addition. Now half of the replacements don’t take place:

You didn’t copy my RegEx.
Replace the space before toll with \s
and add \s? at the end

You may need to enable multiline mode, in which the ^ anchor matches at the start of every line, by adding (?m) to the start

(?m)^\,?\s?[ABCDHMUQV]\d{1,4}(?:\(M\)|\s?[Tt]oll)?\s?
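A short Python sketch of what the anchored version does (Python is an assumption here; Shortcuts uses a different regex engine, though (?m) and ^ behave the same way in both as far as I know). The sample lines are invented for the example. With ^ in place, only road numbers at the start of a line are matched, so one in the middle of a line is left alone.

```python
import re

# Anchored pattern with multiline mode enabled via the inline (?m) flag.
ANCHORED = re.compile(r"(?m)^\,?\s?[ABCDHMUQV]\d{1,4}(?:\(M\)|\s?[Tt]oll)?\s?")

sample = "\n".join([
    ", A345 - Road",
    "A345-Road",
    ",A345 - Road",
    ", 34 - Road",
    "Take the A345 north",
])

print(ANCHORED.findall(sample))
print(ANCHORED.sub("", sample))
```

If the road number can appear after other text on the same line, the anchor would need to be dropped or replaced with a different condition.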