Script that fetches TV episode schedules

Hi.
I’m trying to make a script that creates a text file, or outputs as text, the previous and the two next episodes of several shows, collected from epguides.com.
So far I have only managed to fetch one page from epguides and filter out the relevant lines.

curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]'

So I have a long way to go…

My first obstacle is making a sed or grep command that removes the tags from the lines output by the above command.
Then comes the hard part of selecting the previous, current and next episode, and labeling the shows.

All this so that I can have an auto-updated file on my desktop with GeekTool.

Anyone?
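
For the tag-removal step, here is a minimal sketch, assuming each tag opens and closes on the same line. The function name `strip_tags` and the sample line are made up for illustration; the real epguides.com markup may look different.

```shell
# strip_tags: delete anything that looks like <...> from each input line.
# Assumption: no tag spans more than one line.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Hypothetical sample line, not copied from epguides.com.
printf '%s\n' ' 1.   1-1   29 Aug 05   <a href="http://example.com/ep1">Pilot</a>' | strip_tags
```

In the pipeline this would sit after the grep, e.g. `curl ... | grep '^ *[0-9]' | strip_tags`.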

Hi Laner,

The current episode is the episode with an air date on or before the current date, right?

I notice that there are five spaces before the prod # and air date. So, you can use AppleScript’s text item delimiters to separate the fields, e.g.


set t to do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]'"
set utid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "     " -- 5 spaces
set temp_list to text items of t
set AppleScript's text item delimiters to utid
temp_list

Every third item in the list contains the date. You would get every third item, extract the date, coerce it to a date, compare it with today’s date, etc. This is hard to do with grep or sed alone: grep just returns matching lines, and sed rewrites text by pattern but cannot compare dates.

Finally, you can coerce the date you get from this list to an AppleScript date. For instance, the air date of the first episode is “29 Aug 05”, e.g.


set t to "29 Aug 05"
set d to date t

You can compare this with ‘current date’.

I’m doing something now, but you can start with this unless somebody writes a script for you. Write back for more help.

gl,

No, that won’t work because the third item contains part of the next paragraph. Might be better to get paragraphs or search for the dates.

gl,
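
Searching for the dates directly could be sketched on the shell side like this. It assumes English three-letter month names and two-digit years, as in “29 Aug 05”; the function name `extract_air_date` and the sample line (including the production code `X101`) are hypothetical.

```shell
# extract_air_date: print every "D Mon YY" date found in the input, one per line.
# Assumption: English month abbreviations and two-digit years.
extract_air_date() {
    grep -o -E '[0-9]{1,2} (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{2}'
}

# Hypothetical sample line, not copied from epguides.com.
printf '%s\n' ' 1.   1-1       X101    29 Aug 05   Pilot' | extract_air_date
```

This finds the date wherever it sits on the line, so it does not depend on how many spaces separate the columns.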

Here’s an example of getting the dates.


set t to do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]'"
set p_list to paragraphs of t
set target_dates to {}
set cd to current date
repeat with this_p in p_list
	set utid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "     " -- 5 spaces
	set date_item to last text item of this_p
	set AppleScript's text item delimiters to utid
	tell date_item
		set date_string to word 1 & space & word 2 & space & word 3
	end tell
	set d to date date_string
	set end of target_dates to d
end repeat
target_dates

You don’t really need all the dates, but this is just an example. I wouldn’t know what to do if there was only one episode, if the first episode hasn’t aired yet, if there is no next episode, etc.

gl,

thanks guys!
One problem though is that I get an error message on your script, Kel:
“Invalid date and time date href something”
Is this because it picked the wrong words as date?

EDIT:
It picks the wrong date, because it uses the words as the date, not the numbers. So I need to replace the word-based part that builds date_string with something that gets the numbers.

Hi Laner,

Maybe your AppleScript version doesn’t allow day, month, year, so try switching to:

tell date_item
	set date_string to word 2 & space & word 1 & "," & space & word 3
end tell

This would be month, day, year. For example,

“jan 1, 07”

gl,

I think the script returns the wrong date.

current date
date “onsdag 7. mars 2007 14:38:51”
“Invalid date and time date = href, http.”

From what I can see, the date code pulls the wrong words and generates an error message.
I have therefore stripped the HTML tags from the lines, but I still can’t select the numbers in them.

Code so far:

set t to do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'"
set p_list to paragraphs of t
set target_dates to {}
set cd to current date
repeat with this_p in p_list
	set utid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to " " -- 5 spaces
	set date_item to last text item of this_p
	set AppleScript's text item delimiters to utid
	tell date_item
		set date_string to word 1 & space & word 2 & space & word 3
	end tell
	tell date_item
		set date_string to word 2 & space & word 1 & "," & space & word 3
	end tell
	set d to date date_string
	set end of target_dates to d
end repeat
target_dates

Hi,
You set date_item to the last text item of the string (in the first line (1)), so it couldn’t work.
Parsing a date string and building a date from it depends on the date format settings in the International preference pane.
This works on my machine with German date format settings:

set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
set target_dates to {}
set cd to current date
repeat with this_p in p_list
	set w to words of this_p
	set utid to text item delimiters
	set text item delimiters to space
	tell w to set date_string to items 5 thru 7 as string
	set text item delimiters to utid
	set d to date date_string
	set end of target_dates to d
end repeat
target_dates

the format of my “cd” variable is:
date “onsdag 7. mars 2007 23:11:37”

The result of the script is just a list of dates for me.
How do I set the target_date correctly?

Sorry, I didn’t get the plot of the whole thing.
Which date(s) depending on current date do you want to filter?

I want the date for the currently aired show and the next one.

Okay, what about this:

set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set utid to text item delimiters
	set text item delimiters to space
	tell w to set date_string to items 5 thru 7 as string
	set text item delimiters to utid
	set d to date date_string
	if d < cd then
		set target_dates to {d}
	else
		set end of target_dates to d
		exit repeat
	end if
end repeat
target_dates

Great!
Now all I have to do is reformat the output:
{date “onsdag 29. september 2004 00:00:0”, date “fredag 6. april 2007 00:00:0”}
to include the whole line that contained the date, append the name of the series, add all the other shows and save the whole thing to a text file :slight_smile:

NO, not great, the script seems to work but doesn’t. :confused:
This is a real working version, which is totally independent of international date format settings.
The result is the two whole lines, not only the date

set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g'")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set item 6 of w to ((offset of (item 6 of w) in theMonths) div 3 + 1)
	copy cd to d
	set day of d to item 5 of w as integer
	set month of d to item 6 of w
	set year of d to (item 7 of w as integer) + 2000
	if d < cd then
		set target_dates to {contents of this_p}
	else
		set end of target_dates to contents of this_p
		exit repeat
	end if
end repeat
target_dates
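
The month lookup above (position of the abbreviation in one packed string, divided by three) also works outside AppleScript. A minimal shell/awk sketch follows, assuming English month abbreviations and years 2000 to 2099, the same assumption as the `+ 2000` in the script. The function name `to_sortable_date` is made up for illustration.

```shell
# to_sortable_date DAY MON YY: print the date as YYYYMMDD.
# Assumptions: MON is an English three-letter abbreviation, YY means 20YY.
to_sortable_date() {
    awk -v d="$1" -v m="$2" -v y="$3" 'BEGIN {
        months = "JanFebMarAprMayJunJulAugSepOctNovDec"
        pos = index(months, m)          # 1 for Jan, 4 for Feb, ... 34 for Dec
        if (pos == 0) exit 1            # unknown month name
        printf "%04d%02d%02d\n", 2000 + y, int((pos + 2) / 3), d
    }'
}

to_sortable_date 29 Aug 05   # prints 20050829
```

Two dates in this form can be compared as plain integers, for example against `$(date +%Y%m%d)`, so no locale-dependent date parsing is involved.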

This is looking good. So now all I have to do is repeat it and add the different URLs :slight_smile:

This is where I’m at now:

set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -4 | cut -c1-13,28-54")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
	copy cd to d
	set day of d to item 4 of w as integer
	set month of d to item 5 of w
	set year of d to (item 6 of w as integer) + 2000
	if d < cd then
		set target_dates to {contents of this_p}
	else
		set end of target_dates to contents of this_p
		exit repeat
	end if
end repeat
target_dates
do shell script "echo 'Prison Break'" & target_dates

And the result is:

"Prison Break 41. 2-19 5 Mar 07 Sweet Caroline 42. 2-20 19 Mar 07 Panama"

How can I get the result to be:

"Prison Break 
41. 2-19 5 Mar 07 Sweet Caroline 
42. 2-20 19 Mar 07 Panama"

Any one?

PS: I added a little more Unix so that the curl pipeline always outputs a small amount of identically formatted data.

like this:

set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -4 | cut -c1-13,28-54")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
	copy cd to d
	set day of d to item 4 of w as integer
	set month of d to item 5 of w
	set year of d to (item 6 of w as integer) + 2000
	if d < cd then
		set target_dates to "Prison Break" & return & contents of this_p & return
	else
		set target_dates to target_dates & contents of this_p & return
		exit repeat
	end if
end repeat

display dialog target_dates

Closing in on it!

But when I do this:

set theMonths to "JanFebMarAprMayJunJulAugSepOctNovDec"
set p_list to paragraphs of (do shell script "curl -s http://epguides.com/PrisonBreak/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
	copy cd to d
	set day of d to item 4 of w as integer
	set month of d to item 5 of w
	set year of d to (item 6 of w as integer) + 2000
	if d < cd then
		set target_dates to "Prison Break" & return & contents of this_p & return
	else
		set target_dates to target_dates & contents of this_p & return
		exit repeat
	end if
end repeat
target_dates

set p_list to paragraphs of (do shell script "curl -s http://epguides.com/Lost/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
tell (current date) to set cd to it - (its time)
repeat with this_p in p_list
	set w to words of this_p
	set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
	copy cd to d
	set day of d to item 4 of w as integer
	set month of d to item 5 of w
	set year of d to (item 6 of w as integer) + 2000
	if d < cd then
		set target_dates to "Lost" & return & contents of this_p & return
	else
		set target_dates to target_dates & contents of this_p & return
		exit repeat
	end if
end repeat
target_dates

To output several shows to the same output, so that I can call this script from GeekTool, I only get:

"Lost
 60.   3-12  14 Mar 07   Par Avion
 61.   3-13  21 Mar 07   The Man From Tallahassee
"

Okay, no problem either,
you can add further shows to ShowList (the spelling must be the same as in the URL)

property theMonths : "JanFebMarAprMayJunJulAugSepOctNovDec"
set ShowList to {"PrisonBreak", "Lost"}

set showDates to ""
repeat with i in ShowList
	set showDates to showDates & get_show_data(i)
end repeat
display dialog showDates

on get_show_data(show)
	set p_list to paragraphs of (do shell script "curl -s http://epguides.com/" & show & "/ | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | tail -8 | cut -c1-13,28-")
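	set target_dates to show & return -- fallback so the variable exists even if no episode has aired yet
	tell (current date) to set cd to it - (its time)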
	repeat with this_p in p_list
		set w to words of this_p
		set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
		copy cd to d
		set day of d to item 4 of w as integer
		set month of d to item 5 of w
		set year of d to (item 6 of w as integer) + 2000
		if d < cd then
			set target_dates to show & return & contents of this_p & return
		else
			set target_dates to target_dates & contents of this_p & return & return
			exit repeat
		end if
	end repeat
	return target_dates
end get_show_data
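
Since GeekTool can also run a plain shell command, the same selection logic could be sketched without AppleScript. This is only a sketch under assumptions: it expects cleaned lines containing one “D Mon YY” date each, years 2000 to 2099, and it treats an episode airing today as the "next" one, like the `d < cd` test above. The names `pick_prev_next` and `write_schedule` and the output path in the usage comment are hypothetical.

```shell
# pick_prev_next TODAY: read episode lines on stdin and print the last line
# dated before TODAY (YYYYMMDD) followed by the first line dated on or after it.
pick_prev_next() {
    awk -v today="$1" '
    BEGIN { months = "JanFebMarAprMayJunJulAugSepOctNovDec" }
    {
        # Skip lines that carry no "D Mon YY" date.
        if (!match($0, /[0-9][0-9]? (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9][0-9]/)) next
        split(substr($0, RSTART, RLENGTH), f, " ")
        key = sprintf("%04d%02d%02d", 2000 + f[3], int((index(months, f[2]) + 2) / 3), f[1])
        if (key + 0 < today + 0) { prev = $0 }
        else { nxt = $0; exit }
    }
    END {
        if (prev != "") print prev
        if (nxt != "") print nxt
    }'
}

# write_schedule SHOW...: print a labeled block for every show.
# Show names must be spelled as in the epguides.com URL.
write_schedule() {
    today=$(date +%Y%m%d)
    for show in "$@"; do
        printf '%s\n' "$show"
        curl -s "http://epguides.com/$show/" | grep '^ *[0-9]' | sed -e 's/<[^>]*>//g' | pick_prev_next "$today"
        printf '\n'
    done
}

# Example (not run here): write_schedule PrisonBreak Lost > "$HOME/Desktop/episodes.txt"
```

Finding the date with a pattern instead of `cut -c` avoids depending on fixed column positions, at the cost of assuming no episode title contains something that looks like a date before the real one.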