set p_list to paragraphs of (do shell script "curl -s http://epguides.com/24/ | grep '^[ ]*[0-9]\+\.' | sed -e 's/<[^>]*>//g' | tail -12 | cut -c1-13,28-")
I can’t get the line above to work properly.
I get an error saying that the + in the grep part is unexpected.
Anyone?
Aha!
I need to match lines starting with a space, then a number with 2 or 3 digits followed by a period.
Like " 34." or " 141."
How do I do this without using extended reg exp?
When you use "\" to escape in a do shell script, you need to double it: "\\". You’re escaping the escape for AppleScript, so the shell script receives a literal backslash.
set regexp to "^ [ ]*[^0-9]\\{0,\\}[0-9]\\{2,3\\}[^0-9]\\{0,\\}\\."
do shell script "sed -n '/" & regexp & "/ p' <~/test.txt"
→ " 123.
→ 12. hij"
That’s the result of this text in text file test.txt:
hij
Here I used input from test.txt. sed searches for lines that match the regular expression and prints them. The \{2,3\} means a minimum of 2 and a maximum of 3 digits, but you also have to put zero or more non-digits, [^0-9]\{0,\}, before and after the digits.
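To see the interval expression on its own, here is a minimal shell sketch. The sample lines are made up for illustration; they are not the contents of test.txt or of the epguides page. If you paste the pattern into an AppleScript do shell script string, each backslash would need to be doubled.

```shell
# Hypothetical sample input: a leading space, then 1 to 4 digits and a period.
sample=' 7. one digit
 34. two digits
 141. three digits
 1234. four digits
no number here'

# Basic regex (no -E): a space, optional extra spaces,
# then exactly 2 or 3 digits, then a literal period.
matches=$(printf '%s\n' "$sample" | grep '^ [ ]*[0-9]\{2,3\}\.')

printf '%s\n' "$matches"
```

Only the " 34." and " 141." lines should come back: one digit is too few for \{2,3\}, and with four digits the literal period lands on a digit instead.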
Edited: the spaces didn’t show up in the post, so I replaced them with dashes:
From a quick look at the page, it looks like you have it about right, although I don’t know exactly what you’re trying to get. Remember that grep uses basic regex without -E, and you just need to double up on the \. Don’t use my example as is, because I haven’t studied the page you’re searching. I was mainly trying to show that you need to use a double \, and looking at the page, it looks like you don’t need the two-to-three-digit match you mentioned anyway. Keep at it.
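As a rough illustration of the basic-versus-extended point: \+ is not part of POSIX basic regex (GNU grep accepts it as an extension), while the interval \{1,\} is the portable way to say "one or more", and with -E a plain + does the same job. The sample lines below are invented, not copied from the page.

```shell
# Hypothetical sample lines (not taken from epguides.com).
sample='  9. nine
 34. thirty-four
141. one hundred forty-one
Season 7'

# Basic regex: \{1,\} means "one or more" of the preceding item.
bre=$(printf '%s\n' "$sample" | grep '^ *[0-9]\{1,\}\.')

# Extended regex (-E): + means the same thing; [.] is a literal
# period written without a backslash.
ere=$(printf '%s\n' "$sample" | grep -E '^ *[0-9]+[.]')

printf '%s\n' "$bre"
```

Both commands should select the same three numbered lines. The -E form contains no backslashes at all, so there would be nothing to double inside an AppleScript string.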
property theMonths : "JanFebMarAprMayJunJulAugSepOctNovDec"
set ShowList to {"PrisonBreak", "HowIMetYourMother", "Lost", "Heroes", "Medium", "GreysAnatomy", "24", "BattlestarGalactica"}
set showDates to ""
repeat with i in ShowList
    set showDates to showDates & get_show_data(i)
end repeat
-- display dialog showDates
on get_show_data(show)
    set p_list to paragraphs of (do shell script "curl -s http://epguides.com/" & show & "/ | grep '^ [0-9][0-9]' | sed -e 's/<[^>]*>//g' | tail -12 | cut -c1-13,28-")
    -- cd is today's date at midnight
    tell (current date) to set cd to it - (its time)
    set target_dates to show & return
    repeat with this_p in p_list
        set w to words of this_p
        -- turn the month abbreviation into its number (Jan = 1 ... Dec = 12)
        set item 5 of w to ((offset of (item 5 of w) in theMonths) div 3 + 1)
        copy cd to d
        set day of d to item 4 of w as integer
        set month of d to item 5 of w
        set year of d to (item 6 of w as integer) + 2000
        if d < cd then
            -- already aired: keep only the most recent past episode
            set target_dates to show & return & " " & contents of this_p & return
        else
            -- first episode dated today or later: mark it with * and stop
            set target_dates to target_dates & "*" & contents of this_p & return & return
            exit repeat
        end if
    end repeat
    return target_dates
end get_show_data
Stefan made it.
I’m trying to get the previous and next air times of my TV shows.
But 24 has over 100 episodes, so the script fails when it checks that one: the episode it outputs isn’t the right one.
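One possible cause, offered as a guess: the script’s grep pattern '^ [0-9][0-9]' only matches a single leading space followed by two digits. If the page right-aligns episode numbers in a three-character column (an assumption; the layout was not checked here), then episode 100 and later would start with a digit and be skipped, so tail -12 would return older episodes. The sketch below uses invented lines to compare that pattern with a looser one.

```shell
# Invented lines with numbers right-aligned in three columns (assumed layout).
sample='  9. early episode
 99. last two-digit episode
100. first three-digit episode
141. recent episode'

# Pattern from the script: one space, then two digits.
old=$(printf '%s\n' "$sample" | grep '^ [0-9][0-9]')

# Looser pattern: optional spaces, 1 to 3 digits, then a literal period.
new=$(printf '%s\n' "$sample" | grep '^ *[0-9]\{1,3\}\.')

printf 'old pattern:\n%s\n\nlooser pattern:\n%s\n' "$old" "$new"
```

With this sample, the old pattern keeps only the " 99." line while the looser one keeps all four. Inside do shell script, the backslashes in the looser pattern would each need to be doubled.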