I am writing this script as part of a larger project and I can’t seem to understand an issue I am having with it:
set rsslist to {"http://www.nytimes.com/services/xml/rss/nyt/Magazine.xml"}
repeat with r from 1 to count of every item of rsslist
    set rsscontent to do shell script "curl " & item r of rsslist
    set contentlist to parsecode(rsscontent, "link") as list
    repeat with c from 1 to count of every item of contentlist
        set thisurl to item c of contentlist
        set thiscontent to do shell script "curl " & thisurl as text
        return thiscontent
    end repeat
end repeat

on parsecode(code, tag)
    set opentag to "<" & tag & ">"
    set closetag to "</" & tag & ">"
    set itemlist to {}
    set AppleScript's text item delimiters to opentag
    set taglist to every text item of code as list
    set childtaglist to {}
    repeat with x from 2 to count of every item of taglist
        copy item x of taglist to end of childtaglist
    end repeat
    repeat with thisitem in childtaglist
        set AppleScript's text item delimiters to closetag
        copy text item 1 of thisitem to end of itemlist
        set AppleScript's text item delimiters to opentag
    end repeat
    return itemlist
end parsecode
In this form it works fine: thiscontent comes back with the content of the URL. However, the first link in most of the feeds I have parsed is just the RSS link I started from, so I want to skip it.
When I rewrite the script as below, adding a conditional on the value of c so that the fetch only runs once c reaches 2 (for some reason I get the same link twice at the start of the NYTMag feed), thiscontent comes back with no value. Without variables, if I hard-code the URL, it works fine. (Like this: set thiscontent to do shell script "curl http://www.nytimes.com/2008/12/07/magazine/07cuba-t.html?partner=rss&emc=rss") Is there some formatting issue I am missing? I’ve tried coercing thisurl to string and to text, with the same result.
set rsslist to {"http://www.nytimes.com/services/xml/rss/nyt/Magazine.xml"}
repeat with r from 1 to count of every item of rsslist
    set rsscontent to do shell script "curl " & item r of rsslist
    set contentlist to parsecode(rsscontent, "link") as list
    repeat with c from 1 to count of every item of contentlist
        if c > 1 then
            set thisurl to item c of contentlist
            set thiscontent to do shell script "curl " & thisurl as text
            return thiscontent
        end if
    end repeat
end repeat

on parsecode(code, tag)
    set opentag to "<" & tag & ">"
    set closetag to "</" & tag & ">"
    set itemlist to {}
    set AppleScript's text item delimiters to opentag
    set taglist to every text item of code as list
    set childtaglist to {}
    repeat with x from 2 to count of every item of taglist
        copy item x of taglist to end of childtaglist
    end repeat
    repeat with thisitem in childtaglist
        set AppleScript's text item delimiters to closetag
        copy text item 1 of thisitem to end of itemlist
        set AppleScript's text item delimiters to opentag
    end repeat
    return itemlist
end parsecode
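
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: the article URL contains an "&", and in the shell an unquoted "&" ends the command and runs it in the background, so curl might only be handed the part of the URL before it. The snippet below makes no network request; it only builds the two command strings so they can be compared. The variable names unquotedcommand and quotedcommand are made up for illustration, and it assumes the fetched link looks like the hard-coded URL above. quoted form is AppleScript's standard way to make text safe for do shell script.

-- Illustrative only: compare the command text with and without shell quoting.
set thisurl to "http://www.nytimes.com/2008/12/07/magazine/07cuba-t.html?partner=rss&emc=rss"
-- What the script currently hands to the shell (the & is left bare):
set unquotedcommand to "curl " & thisurl
-- The same command with the URL wrapped in single quotes for the shell:
set quotedcommand to "curl " & quoted form of thisurl
get {unquotedcommand, quotedcommand}

If the quoted version turns out to behave differently, the corresponding change in the script would be to use quoted form of thisurl in the do shell script line.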