Hi.
I’m trying to make a script that creates a text file, or outputs as text, the previous and the two next episodes of several shows, collected from epguides.com.
So far I have only managed to fetch one page from epguides.com and filter out the relevant lines.
My first obstacle is making a sed or grep command that removes the tags from the lines output by the above command.
Then comes the hard part: selecting the previous, current and next episode, and labeling the shows.
All this so that I can have an auto-updated file on my desktop with GeekTool.
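For the tag-stripping step, here is a minimal sketch in plain shell. It assumes the tags are simple, sit on a single line, and have no '>' inside attribute values. The function name strip_tags and the sample line are made up for illustration; the sample is not copied from epguides.com.

```shell
# strip_tags: remove anything that looks like an HTML tag from stdin.
# Assumes tags do not span lines and contain no '>' inside attribute values.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Hypothetical sample line, only for illustration.
printf '%s\n' ' 1.   1-1     29 Aug 05   <a href="http://example.com/ep1">Pilot</a>' | strip_tags
```

The pattern matches a '<', then any run of characters that are not '>', then the closing '>', and replaces each such match with nothing.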
The current episode is the episode with an air date on or before the current date, right?
I notice that there are five spaces before the prod # and air date, so you can use AppleScript’s text item delimiters to separate the fields, e.g.:
set t to do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]'"
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "     " -- 5 spaces
set temp_list to text items of t
set AppleScript's text item delimiters to utid
temp_list
Every third item in the list contains the date. You would take every third item, extract the date, coerce it to a date, compare it with today’s date, and so on. You can’t do that with grep or sed alone: grep only returns matching lines, and sed only substitutes text that matches a pattern; neither can compare dates.
Finally, you can coerce the date you get from this list to an AppleScript date. For instance, the air date of the first episode is “29 Aug 05”, e.g.:
set t to "29 Aug 05"
set d to date t
You can compare this with ‘current date’.
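The same comparison can be sketched in shell instead of AppleScript by turning the air date into a sortable number. This is only a sketch, assuming English three-letter month names and two-digit years that all fall in 2000-2099; the function name date_key is hypothetical.

```shell
# date_key: turn a date such as "29 Aug 05" into a sortable key such as 20050829.
# Assumes English three-letter month names and years in 2000-2099.
date_key() {
    printf '%s\n' "$1" | awk '{
        months = "JanFebMarAprMayJunJulAugSepOctNovDec"
        pos = index(months, $2)
        if (pos == 0) exit 1           # unknown month name
        month = int((pos + 2) / 3)     # Jan -> 1, Feb -> 2, ... Dec -> 12
        printf "%04d%02d%02d\n", 2000 + $3, month, $1
    }'
}

today=$(date +%Y%m%d)
aired=$(date_key "29 Aug 05")
if [ "$aired" -le "$today" ]; then
    echo "already aired"
else
    echo "not aired yet"
fi
```

Because the key is year, then month, then day, a plain numeric comparison orders the dates correctly and does not depend on the system's date format settings.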
I’m doing something now, but you can start with this unless somebody writes a script for you. Write back for more help.
set t to do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]'"
set p_list to paragraphs of t
set target_dates to {}
set cd to current date
repeat with this_p in p_list
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "     " -- 5 spaces
set date_item to last text item of this_p
set AppleScript's text item delimiters to utid
tell date_item
set date_string to word 1 & space & word 2 & space & word 3
end tell
set d to date date_string
set end of target_dates to d
end repeat
target_dates
You don’t really need all the dates, but this is just an example. I wouldn’t know what to do if there was only one episode, if the first episode hasn’t aired yet, if there is no next episode, etc.
Thanks, guys!
One problem, though: I get an error message from your script, Kel.
“Invalid date and time date href something”
Is this because it picked the wrong words as the date?
EDIT:
It picks the wrong date: it uses the wrong words as the date, not the numbers, and that generates the error message. So I need to replace the word-based part that builds date_string with something that gets the numbers.
I have therefore stripped the HTML tags from the output, but I still can’t select the numbers in the lines.
Code so far:
set t to do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'"
set p_list to paragraphs of t
set target_dates to {}
set cd to current date
repeat with this_p in p_list
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "     " -- 5 spaces
set date_item to last text item of this_p
set AppleScript's text item delimiters to utid
tell date_item
set date_string to word 1 & space & word 2 & space & word 3
end tell
tell date_item
set date_string to word 2 & space & word 1 & "," & space & word 3
end tell
set d to date date_string
set end of target_dates to d
end repeat
target_dates
Hi,
You set date_item to the last text item of the string (in the first line (1)), so it couldn’t work.
Parsing a date string and building a date from it depends on the date format settings in the International preference pane.
This works on my machine with German date format settings:
set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
set target_dates to {}
set cd to current date
repeat with this_p in p_list
set w to words of this_p
set utid to text item delimiters
set text item delimiters to space
tell w to set date_string to items 5 thru 7 as string
set text item delimiters to utid
set d to date date_string
set end of target_dates to d
end repeat
target_dates
set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set utid to text item delimiters
set text item delimiters to space
tell w to set date_string to items 5 thru 7 as string
set text item delimiters to utid
set d to date date_string
if d < cd then
set target_dates to {d}
else
set end of target_dates to d
exit repeat
end if
end repeat
target_dates
Great!
Now all I have to do is reformat the output:
{date “onsdag 29. september 2004 00:00:0”, date “fredag 6. april 2007 00:00:0”}
to include the whole line that contained the date, append the name of the series, add all the other shows and save the whole thing to a text file.
No, not great: the script seems to work but doesn’t.
This is a real working version, which is totally independent of international date format settings.
The result is the two whole lines, not only the dates.
set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set item 6 of w to ((offset of (item 6 of w) in theMonths) div 3 + 1)
copy cd to d
set day of d to item 5 of w as integer
set month of d to item 6 of w
set year of d to (item 7 of w as integer) + 2000
if d < cd then
set target_dates to {contents of this_p}
else
set end of target_dates to contents of this_p
exit repeat
end if
end repeat
target_dates
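The selection logic of the loop above (remember the last line dated before today, then stop at the first line dated today or later) can also be sketched with awk. This is an illustration, not a drop-in replacement: the function name pick_episodes is hypothetical, it assumes tag-free lines with a "D Mon YY" air date and years in 2000-2099, and it takes today's date as a YYYYMMDD argument.

```shell
# pick_episodes: read tag-free episode lines on stdin and print the last
# episode dated before $1 (YYYYMMDD), then the first one dated on or after it.
# Lines without a recognisable "D Mon YY" date are skipped.
pick_episodes() {
    awk -v today="$1" '
    BEGIN { months = "JanFebMarAprMayJunJulAugSepOctNovDec" }
    match($0, /[0-9][0-9]? [A-Z][a-z][a-z] [0-9][0-9]/) {
        split(substr($0, RSTART, RLENGTH), part, " ")
        pos = index(months, part[2])
        if (pos == 0) next                      # not a month name
        key = sprintf("%04d%02d%02d", 2000 + part[3], int((pos + 2) / 3), part[1])
        if (key + 0 < today + 0) { previous = $0; next }
        found = 1                               # first episode today or later
        if (previous != "") print previous
        print
        exit
    }
    # No upcoming episode listed: fall back to the last one that aired.
    END { if (!found && previous != "") print previous }
    '
}

# The two Lost lines quoted in this thread, checked against 15 Mar 2007.
printf '%s\n' \
    '60. 3-12 14 Mar 07 Par Avion' \
    '61. 3-13 21 Mar 07 The Man From Tallahassee' | pick_episodes 20070315
```

With that date both lines are printed: the first as the previous episode and the second as the next one. If no line is dated today or later, only the last aired episode is printed.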
set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -4 | cut -c1-13,28-54")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
copy cd to d
set day of d to item 4 of w as integer
set month of d to item 5 of w
set year of d to (item 6 of w as integer) + 2000
if d < cd then
set target_dates to {contents of this_p}
else
set end of target_dates to contents of this_p
exit repeat
end if
end repeat
target_dates
do shell script "echo 'Prison Break'" & target_dates
And the result is:
"Prison Break 41. 2-19 5 Mar 07 Sweet Caroline 42. 2-20 19 Mar 07 Panama"
How can I get the result to be:
"Prison Break
41. 2-19 5 Mar 07 Sweet Caroline
42. 2-20 19 Mar 07 Panama"
Anyone?
PS: I added a little more Unix so that the curl pipeline always outputs a small amount of identically formatted data.
set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -4 | cut -c1-13,28-54")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
copy cd to d
set day of d to item 4 of w as integer
set month of d to item 5 of w
set year of d to (item 6 of w as integer) + 2000
if d < cd then
set target_dates to "Prison Break" & return & contents of this_p & return
else
set target_dates to target_dates & contents of this_p & return
exit repeat
end if
end repeat
display dialog target_dates
set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl -s http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
copy cd to d
set day of d to item 4 of w as integer
set month of d to item 5 of w
set year of d to (item 6 of w as integer) + 2000
if d < cd then
set target_dates to "Prison Break" & return & contents of this_p & return
else
set target_dates to target_dates & contents of this_p & return
exit repeat
end if
end repeat
target_dates
set p_list to paragraphs of (do shell script "curl -s http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
set w to words of this_p
set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
copy cd to d
set day of d to item 4 of w as integer
set month of d to item 5 of w
set year of d to (item 6 of w as integer) + 2000
if d < cd then
set target_dates to "Lost" & return & contents of this_p & return
else
set target_dates to target_dates & contents of this_p & return
exit repeat
end if
end repeat
target_dates
When I try to output several shows in one result, so that I can call this script from GeekTool, I only get:
"Lost
60. 3-12 14 Mar 07 Par Avion
61. 3-13 21 Mar 07 The Man From Tallahassee
"
Okay, no problem either.
You can add further shows to ShowList (the spelling must be the same as in the URL):
property theMonths : "JanFebMarAprMayJunJulAugSepOctNovDec"

set ShowList to {"PrisonBreak", "Lost"}
set showDates to ""
repeat with i in ShowList
    set showDates to showDates & get_show_data(i)
end repeat
display dialog showDates

on get_show_data(show)
    -- fetch the page, keep the episode lines, strip tags, keep the last 8 lines
    set p_list to paragraphs of (do shell script "curl -s http://epguides.com/" & show & "/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
    -- today at midnight
    tell (current date) to set cd to it - (its time)
    -- start with just the label, in case no fetched episode has aired yet
    set target_dates to show & return
    repeat with this_p in p_list
        set w to words of this_p
        -- month abbreviation to month number
        set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
        copy cd to d
        set day of d to item 4 of w as integer
        set month of d to item 5 of w
        set year of d to (item 6 of w as integer) + 2000
        if d < cd then
            -- already aired: remember only the most recent one
            set target_dates to show & return & contents of this_p & return
        else
            -- first episode today or later: append it and stop
            set target_dates to target_dates & contents of this_p & return & return
            exit repeat
        end if
    end repeat
    return target_dates
end get_show_data
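For comparison, the fetch-and-label loop over several shows can be sketched as a plain shell script, which GeekTool could run directly. This is a sketch under assumptions: the function names fetch_show, episode_lines and show_report are hypothetical, the URL layout is the one used by the scripts in this thread, and the previous/next selection is left out so that the last 8 episode lines of each show are printed as they are.

```shell
# fetch_show: print the raw page for one show (URL layout as used in this thread).
fetch_show() {
    curl -s "http://epguides.com/$1/"
}

# episode_lines: keep only lines that start with an episode number, then strip tags.
episode_lines() {
    grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'
}

# show_report: for each show name given, print the name as a label, then the
# last 8 episode lines, then a blank separator line.
show_report() {
    for show in "$@"; do
        printf '%s\n' "$show"
        fetch_show "$show" | episode_lines | tail -n 8
        printf '\n'
    done
}

# Example call (needs network access), left commented out:
# show_report PrisonBreak Lost
```

Redirecting the output, for example with show_report PrisonBreak Lost > episodes.txt (a hypothetical file name), would give the text file asked for at the start of the thread.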