Does anyone have any good ideas on how to search for a date and time in a large block of text, like an email? I have a reasonable script that recognizes standard Apple date formats as well as today, tomorrow, this friday, next thu, etc., so now I want it to be able to pick out the date from an arbitrary block of text.
I was thinking of trying this code on every group of 2 or 3 words in the text, and if it works, assuming that’s a date. Then I’d look for a “PM” or “AM” or “:” or “o’clock” or “at” in the 5 words or so before and after where the date was found. But if anyone has other ideas, that’d be much appreciated. I’d love to get something working like Gmail, where it recognizes dates in your text for adding to the calendar.
Here’s the date stuff I have so far:
property wdays : {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}

set dateset to false
set pretext to ""
-- Position of today's weekday in wdays (1 = Monday through 7 = Sunday)
set todayd to my dayNumber((weekday of (current date)) as string)

-- Keep asking until the answer can be turned into a date
repeat while not dateset
    set datet to text returned of (display dialog pretext & "What date do you want to set this event for?" default answer "tomorrow" with title "Event Date")
    set errdate to false
    set foundd to my dayNumber(datet)
    -- For input like "next friday", the weekday is the second word
    try
        set foundd2 to my dayNumber(word 2 of datet)
    on error
        set foundd2 to -1
    end try
    if datet is "tomorrow" then
        set mydate to (current date) + days
    else if datet is "today" then
        set mydate to (current date)
    else if foundd is not -1 then
        -- A bare weekday: the nearest one, which is today if the names match
        set mydate to (current date) + days * ((foundd - todayd + 7) mod 7)
    else if foundd2 is not -1 and word 1 of datet is "next" then
        -- "next <weekday>": the nearest one plus a further week
        set mydate to (current date) + days * (((foundd2 - todayd + 7) mod 7) + 7)
    else
        -- Fall back to AppleScript's own date parsing
        try
            set mydate to date datet
        on error
            set pretext to "That was not a valid date format. Try something like 9/18, tuesday, next friday, or Sept 18.
"
            set errdate to true
        end try
    end if
    set dateset to not errdate
end repeat
log mydate

set timet to text returned of (display dialog "What time do you want to set this event for?" default answer "12:00 PM" with title "Event Date")
-- The date part is a throwaway; only the time of day is used
set timed to (date ("9/9/09 " & timet))
log timed
set time of mydate to time of timed
log mydate

-- Returns 1 (Monday) through 7 (Sunday) for a full or abbreviated weekday name, or -1 if there is no match
on dayNumber(dayName)
    set foundd to -1
    repeat with whichd from 1 to count of wdays
        ignoring case
            if dayName is item whichd of wdays then set foundd to whichd
        end ignoring
    end repeat
    -- Items 8 to 14 are the abbreviations of items 1 to 7
    if foundd > 7 then set foundd to foundd - 7
    log dayName & " = " & foundd
    return foundd
end dayNumber
That’s a formidable task, because the emails will not necessarily use any reasonable format for a date, and you can’t be sure whether a date reference is an event to be recorded or something else:
… if you recall, in last September’s meeting, we …
… is next Tuesday ok for you?
… Thank God it’s Friday. I couldn’t take another week like this one.
… can we schedule a meeting sometime in the week before the October Sales Meeting? I really …
… He’s away just now - left two days ago for an 18-day cruise.
… It’s been a blue Monday. I hope …
… Can we meet at Ruby Tuesday’s? It’s just outside the mall door.
Thanks for the reply, Adam. Yes, that’s true, there may be a lot of other date references, but I am only worried about recognizing them when the user invokes my script to create a new event in iCal. In that case a date must be present in the text and must be in the future, which limits the search.
As an indicator of how big a task it would be, look at this script, which handles just one case: a weekday preceded by “next”. Even then it’s sort of wrong, because the average person referring to the Thursday of this week wouldn’t say ‘next’, they would say ‘this’. The script would work as we expected if the message was for next Monday, though. We’d have to add a check to see whether Thursday falls in this week after today. You see how it builds up? The handler just finds the next weekday of a given name after today, not the one in the following week. English can be rather imprecise.
property tDays : {Wednesday, Thursday, Friday, Saturday, Sunday, Monday, Tuesday} -- no quotes
property astid : AppleScript's text item delimiters

set emText to "Hi, John;
Sure, next Thursday at 2 is fine with me.
Regards;
Jim"

repeat with aWD in tDays
    if (aWD as text) is in emText then
        -- Split the text at the weekday name and check the word just before it
        set AppleScript's text item delimiters to aWD as text
        if last word of text item 1 of emText is "next" then
            set theWkDay to getNextWkDay(aWD as text, current date)
            set AppleScript's text item delimiters to astid
            exit repeat
        end if
    end if
    set AppleScript's text item delimiters to astid
end repeat

to getNextWkDay(wkDay, theDate) -- adapted from a script by Nigel Garvey
    -- Reference date used as a known Wednesday; tDays starts with Wednesday to match.
    -- This date string depends on the system's date format settings.
    set keyDate to date "Wednesday, January 1, 1000 12:00:00 AM"
    -- Find the position k of the requested weekday in tDays
    repeat with k from 1 to 7
        if item k of tDays as string = wkDay then exit repeat
    end repeat
    -- A date in the distant past that falls on the requested weekday
    set tWD to keyDate + (k - 1) * days
    -- Step back to the most recent such weekday, then forward one week
    return theDate - (theDate - tWD) mod weeks + weeks
end getNextWkDay
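One way the extra "is that weekday still ahead of us this week?" check could be added is sketched below. This is only an illustration, not part of the script above: the handler name resolveWeekday and its parameters are hypothetical, it assumes a week runs Sunday through Saturday, and it assumes an AppleScript version in which a weekday constant coerces to an integer (Sunday = 1 through Saturday = 7).

```applescript
-- Hypothetical handler, not from the original script.
-- qualifier: "this" or "next"; targetDay: a weekday constant such as Thursday.
-- Assumes weeks run Sunday to Saturday and that weekday constants
-- coerce to integers (Sunday = 1 through Saturday = 7).
on resolveWeekday(qualifier, targetDay, fromDate)
    set todayNum to (weekday of fromDate) as integer
    set targetNum to targetDay as integer
    if targetNum > todayNum then
        -- The weekday is still to come in the current week
        set daysAhead to targetNum - todayNum
        -- "next" then means the one in the following week
        if qualifier is "next" then set daysAhead to daysAhead + 7
    else
        -- The weekday is today or already past, so the nearest one is next week anyway
        set daysAhead to targetNum - todayNum + 7
    end if
    return fromDate + daysAhead * days
end resolveWeekday

resolveWeekday("next", Thursday, current date)
```

Under these assumptions, "this Thursday" and "next Thursday" give different dates only while Thursday is still ahead in the current week. A different definition of the week (Monday first, say) would need a different comparison.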
Thanks for the code snippet. I’m not so worried about the semantics of “next thursday”. I can leave that up to the user, as they will be able to verify each detected date. What I was really asking is what everyone thinks is a good way to go about detecting valid dates in a text block: either as you have done, by looking for weekdays, months, slashes as in 9/24/06, etc., or by running the code I had above on all of the 1-, 2-, and 3-word groups from the text block until it finds a valid date.
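For comparison, the word-group approach might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not a tested solution: the handler name findDateCandidates and the record labels sourceText and parsedDate are hypothetical, and the plain date coercion stands in for the fuller parsing code earlier in the thread (today, tomorrow, next friday and so on).

```applescript
-- Hypothetical handler: tries every run of 3, 2, then 1 words as a date.
-- Returns a list of records, one per run that AppleScript could coerce to a date.
on findDateCandidates(theText)
    set theWords to words of theText
    set wordCount to count of theWords
    set hits to {}
    set savedDelims to AppleScript's text item delimiters
    set AppleScript's text item delimiters to space -- used when joining words back together
    set i to 1
    repeat while i <= wordCount
        set matched to false
        -- Longest group first, so a three-word date is not cut short
        repeat with groupSize from 3 to 1 by -1
            set lastIndex to i + groupSize - 1
            if lastIndex <= wordCount then
                set candidate to (items i thru lastIndex of theWords) as text
                try
                    -- Stand-in for the fuller date-recognition code
                    set foundDate to date candidate
                    set end of hits to {sourceText:candidate, parsedDate:foundDate}
                    set i to lastIndex + 1 -- skip past the words just used
                    set matched to true
                    exit repeat
                end try
            end if
        end repeat
        if not matched then set i to i + 1
    end repeat
    set AppleScript's text item delimiters to savedDelims
    return hits
end findDateCandidates

findDateCandidates("Can we do lunch on Sept 18 at noon?")
```

Two caveats with this sketch. Splitting with words may break a slashed date such as 9/24/06 into separate numbers, so splitting on spaces may suit that format better. The date coercion can also be lenient enough to accept a bare number, which would produce false hits that the AM/PM/“at” context check, or the user, would need to weed out.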