Is there a better RegEx for this?

OK. Here’s a version that’s more flexible about separators:

(* 
Regular Expressions by James Sehrier on Keyboard Maestro list that Adam Bell has adjusted from that format to Satimage regexp AppleScript format. Modified by Yvan Koenig to return four-digit years. Adapted thence for sed by NG.
*)

set myDates to {"3/30/10", "9.11.2012", "1-1-00", "2:2:2", "31 12 43", "31_12_1943", "9a3b1949", "7c7d7"}

set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set myDates to myDates as text
set AppleScript's text item delimiters to astid

--> first over-pad every date: "0" in front of each of the first two numbers and "20" in front of the year
set regEx_1 to "([0-9]+[^0-9])([0-9]+[^0-9])([0-9]+)"
set Replace_1 to "0\\10\\220\\3"

-- Insert an extra "0" into any year which begins with "20" and only has 3 digits.
set regEx_1a to "([^0-9])20([0-9]([^0-9]\\|$))"
set Replace_1a to "\\1200\\2"

--> then trim each part back to two digits (four for the year), giving mm/dd/yyyy format
set regEx_2 to "[0-9]*([0-9]{2}[^0-9])[0-9]*([0-9]{2}[^0-9])[0-9]*([0-9]{4})"
set Replace_2 to "\\1\\2\\3"

set newDates to paragraphs of (do shell script ("echo " & quoted form of myDates & " | sed -Ee 's|" & regEx_1 & "|" & Replace_1 & "|g' -e 's|" & regEx_1a & "|" & Replace_1a & "|g' -e 's|" & regEx_2 & "|" & Replace_2 & "|g'"))
--> {"03/30/2010", "09.11.2012", "01-01-2000", "02:02:2002", "31 12 2043", "31_12_1943", "09a03b1949", "07c07d2007"}
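For anyone who wants to watch the three substitutions work outside AppleScript, here is a rough Terminal sketch of the same pipeline. The function name `normalize_date` is made up for illustration, the input is one date per call rather than a return-delimited string, and "#" is used as the sed delimiter (my assumption, not part of the script above) so that the alternation in the second expression can be written as a plain "|".

```shell
# Sketch only: the same three substitutions as above, applied to a single date.
# "#" replaces "|" as the sed delimiter (an assumption made here for portability),
# so the alternation in the second expression needs no backslash.
normalize_date() {
  printf '%s\n' "$1" | sed -E \
    -e 's#([0-9]+[^0-9])([0-9]+[^0-9])([0-9]+)#0\10\220\3#g' \
    -e 's#([^0-9])20([0-9]([^0-9]|$))#\1200\2#g' \
    -e 's#[0-9]*([0-9]{2}[^0-9])[0-9]*([0-9]{2}[^0-9])[0-9]*([0-9]{4})#\1\2\3#g'
}

# Stage 1 alone over-pads on purpose; stages 2 and 3 then repair and trim the result.
printf '%s\n' '3/30/10' | sed -E 's#([0-9]+[^0-9])([0-9]+[^0-9])([0-9]+)#0\10\220\3#g'
# prints 03/030/2010

normalize_date '3/30/10'   # prints 03/30/2010
normalize_date '2:2:2'     # prints 02:02:2002
normalize_date '31 12 43'  # prints 31 12 2043
```

In the replacement "0\10\220\3", sed reads back-references as single digits, so "\10" means group 1 followed by a literal "0" and "\220" means group 2 followed by a literal "20".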

Nigel’s solution directly above is a neat improvement on the approach I had taken. Unlike him, I’m hopeless so far with sed, so as you can see, I just added a bit to fix the date order. It’s a work in progress that will next incorporate Nigel’s fix.


-- Nigel's version of RegExDates converted to handler with choice by acb

(* 
Regular Expressions by James Sehrier on Keyboard Maestro list that Yvan Koenig has adjusted from that format to Satimage regexp AppleScript format. Adjusted thence for sed by NG.
*)

set myDates to {"3/30/10", "9/11/2012", "1/1/00", "2/2/2", "31/12/43", "31/12/1943", "9/3/1949", "7/7/7"}

NormalizeDates(myDates, "FS", ".") -- a list, type of presentation, delimiter preferred

to NormalizeDates(dateList, Dtype, delim) -- Dtypes: "FS" or "ddmmyyyy"
	-- replace the list variable with paragraphs of a single string for regex analysis
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to return
	set myDates to dateList as text
	set AppleScript's text item delimiters to astid
	
	--> first over-pad every date: "0" in front of each of the first two numbers and "20" in front of the year
	set regEx_1 to "([0-9]+/)([0-9]+/)([0-9]+)"
	set Replace_1 to "0\\10\\220\\3"
	
	-- Insert an extra "0" into any year which begins with "20" and only has 3 digits.
	set regEx_1a to "/20([0-9]([^0-9]\\|$))"
	set Replace_1a to "/200\\1"
	
	--> then trim each part back to two digits (four for the year), giving mm/dd/yyyy format
	set regEx_2 to "[0-9]*([0-9]{2}/)[0-9]*([0-9]{2}/)[0-9]*([0-9]{4})"
	set Replace_2 to "\\1\\2\\3"
	
	set newDates to paragraphs of (do shell script ("echo " & quoted form of myDates & " | sed -Ee 's|" & regEx_1 & "|" & Replace_1 & "|g' -e 's|" & regEx_1a & "|" & Replace_1a & "|g' -e 's|" & regEx_2 & "|" & Replace_2 & "|g'"))
	
	--> {"03/30/2010", "09/11/2012", "01/01/2000", "02/02/2002", "31/12/2043", "31/12/1943", "09/03/1949", "07/07/2007"}
	
	if Dtype is "FS" then
		-- reverse date order in newDates
		set yearFirst to {}
		repeat with oneDate in newDates
			set end of yearFirst to my ReverseDateOrder(oneDate, delim)
		end repeat
		set newDates to yearFirst
	end if
	return newDates
end NormalizeDates

to ReverseDateOrder(dateText, delim)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	set Rev to reverse of (text items of dateText)
	set AppleScript's text item delimiters to delim
	set Rev to Rev as text
	-- restore the delimiters BEFORE returning; a statement after "return" never runs
	set AppleScript's text item delimiters to tid
	return Rev
end ReverseDateOrder

Hi, Adam. Ever more flexible! :)

With a slight change to regEx_2, it’s possible to achieve reversals and separator impositions by supplying an appropriate Replace_2:

(* 
Regular Expressions by James Sehrier on Keyboard Maestro list that Adam Bell has adjusted from that format to Satimage regexp AppleScript format. Modified by Yvan Koenig to return four-digit years. Adapted thence for sed by NG. Converted to handler with choice by acb. Further fooled with by NG.
*)

to NormalizeDates(myDates, params)
	set {fs:fs, delim:delim} to (params & {fs:false, delim:missing value})
	
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to return
	set myDates to myDates as text
	set AppleScript's text item delimiters to astid
	
	--> first over-pad every date: "0" in front of each of the first two numbers and "20" in front of the year
	set regEx_1 to "([0-9]+[^0-9])([0-9]+[^0-9])([0-9]+)"
	set Replace_1 to "0\\10\\220\\3"
	
	-- Insert an extra "0" into any year which begins with "20" and only has 3 digits.
	set regEx_1a to "([^0-9])20([0-9]([^0-9]\\|$))"
	set Replace_1a to "\\1200\\2"
	
	--> then fix for mm/dd/yyyy format and handle reversal and/or separator imposition as specified.
	set regEx_2 to "[0-9]*([0-9]{2})([^0-9])[0-9]*([0-9]{2})([^0-9])[0-9]*([0-9]{4})"
	if (delim is missing value) then -- Use existing separators.
		if (fs) then -- Reverse the element order.
			set Replace_2 to "\\5\\4\\3\\2\\1"
		else -- Maintain the element order.
			set Replace_2 to "\\1\\2\\3\\4\\5"
		end if
	else -- Use the parameter-specified separator.
		if (fs) then -- Reverse the element order.
			set Replace_2 to "\\5" & delim & "\\3" & delim & "\\1"
		else -- Maintain the element order.
			set Replace_2 to "\\1" & delim & "\\3" & delim & "\\5"
		end if
	end if
	
	set newDates to paragraphs of (do shell script ("echo " & quoted form of myDates & " | sed -Ee 's|" & regEx_1 & "|" & Replace_1 & "|g' -e 's|" & regEx_1a & "|" & Replace_1a & "|g' -e 's|" & regEx_2 & "|" & Replace_2 & "|g'"))
end NormalizeDates

set myDates to {"3/30/10", "9.11.2012", "1-1-00", "2:2:2", "31 12 43", "31_12_1943", "9a3b1949", "7c7d7"}

NormalizeDates(myDates, {}) -- Defaults: same ddmmyyyy or mmddyyyy order, same separator(s).
--> {"03/30/2010", "09.11.2012", "01-01-2000", "02:02:2002", "31 12 2043", "31_12_1943", "09a03b1949", "07c07d2007"}
NormalizeDates(myDates, {fs:true}) -- yyyymmdd (or yyyyddmm, needs more work for US computers).
--> {"2010/30/03", "2012.11.09", "2000-01-01", "2002:02:02", "2043 12 31", "1943_12_31", "1949b03a09", "2007d07c07"}
NormalizeDates(myDates, {delim:"."}) -- Impose the separator ".".
--> {"03.30.2010", "09.11.2012", "01.01.2000", "02.02.2002", "31.12.2043", "31.12.1943", "09.03.1949", "07.07.2007"}
NormalizeDates(myDates, {fs:true, delim:"."}) -- Reverse AND impose separator.
--> {"2010.30.03", "2012.11.09", "2000.01.01", "2002.02.02", "2043.12.31", "1943.12.31", "1949.03.09", "2007.07.07"}

If you wanted the option of converting US short dates to yyyymmdd, you could modify the script so that the fs value, instead of being simply true or false, tells the handler whether the input dates are in ddmmyyyy or mmddyyyy order.
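As a rough idea of what that might look like at the sed level, here is a Terminal sketch. The helper name `to_ymd` and the order labels "mdy" and "dmy" are invented for illustration, and it assumes the date has already been normalized to two-digit day and month and a four-digit year. The regex is the five-group regEx_2 from the handler without the leading `[0-9]*` trimming; only the replacement changes with the input order.

```shell
# Hypothetical helper: $1 is the input order ("mdy" or "dmy"),
# $2 is an already-normalized date such as 03/30/2010.
to_ymd() {
  re='([0-9]{2})([^0-9])([0-9]{2})([^0-9])([0-9]{4})'
  case "$1" in
    mdy) rep='\5\2\1\4\3' ;;  # \1 = month, \3 = day, \5 = year
    dmy) rep='\5\2\3\4\1' ;;  # \1 = day, \3 = month, \5 = year
    *) echo "unknown order: $1" >&2; return 1 ;;
  esac
  # "#" is used as the sed delimiter here; the original separators (\2, \4) are kept.
  printf '%s\n' "$2" | sed -E "s#${re}#${rep}#"
}

to_ymd mdy '03/30/2010'   # prints 2010/03/30
to_ymd dmy '31_12_1943'   # prints 1943_12_31
```

Inside the AppleScript handler the same replacement strings would need their backslashes doubled (for example "\\5\\2\\1\\4\\3"), and the chosen order could travel in the params record alongside delim.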

Quite slick. I haven’t used a parameter list like that before, either. Neat.