This is my first really large AppleScript, and as I add more sites it has started crashing after the first few… Have I reached a memory boundary, or is there a more basic problem I am missing?
--==== Hexley's Headlines ====--
-- prompts me with my news and brings up the articles
--==== Hexley's Headlines organization ====--
-- Classes
-- +Link
-- +Site
-- Singletons
-- +based on site:
-- ++Slashdot
-- Globals
-- run handler
--==== Classes ====--
(* Link *)
script Link
property myHeadline : missing value
property myURL : missing value
on openPage()
-- "characters 1 thru 7 of myURL" is a list and never equals a string, so test the prefix directly
if myURL does not start with "http://" then
set myURL to "http://" & myURL
end if
open location myURL
end openPage
end script
(* Site *)
script Site
property cur_links : {}
property old_links : {}
property new_links : {}
on update()
copy new_links to old_links
set new_links to {}
set cur_links to {}
getNewData()
set new_links to result
repeat with eachLink in new_links
if (count of old_links) > 0 then
if myHeadline of eachLink = myHeadline of the first item of old_links then
exit repeat
else
copy eachLink to the end of cur_links
end if
else
copy eachLink to the end of cur_links
end if
end repeat
end update
on getNewData()
set endResult to {}
do shell script myScript()
set listing to the result
set myLines to every paragraph of listing
set myCount to count myLines
repeat with index from 1 to myCount by 2
copy Link to newLink
set myURL of newLink to item index of myLines
set myHeadline of newLink to item (index + 1) of myLines
copy newLink to end of endResult
end repeat
return endResult
end getNewData
on prompt()
repeat with eachLink in cur_links
set thePrompt to my name & ¬
": " & myHeadline of eachLink
set button to button returned of (display dialog thePrompt buttons {"Open Link", "Skip Link", "Skip Site"})
if button = "Open Link" then
tell eachLink to openPage()
else if button = "Skip Site" then
exit repeat
end if
end repeat
end prompt
end script
--==== Singletons ====--
script Macslash
property parent : Site
property name : "Macslash"
on myScript()
return "curl macslash.org/search.pl | " & ¬
"grep 'article.pl' | " & ¬
"sed 's/.*\\/\\/\\(.*\\)&.*>\\(.*\\)<\\/A>.*/\\1
\\2/'"
end myScript
end script
script Slashdot
property parent : Site
property name : "Slashdot"
on myScript()
return "curl slashdot.org/search.pl " & ¬
"| grep 'article.pl' " & ¬
"| sed 's/.*\\/\\/\\(.*\\)&mode.*>\\(.*\\)<\\/A>/\\1
\\2/' "
end myScript
end script
script Oreilly
property parent : Site
property name : "O'Reilly"
on myScript()
return "curl oreillynet.com/pub/q/all_mac_articles | " & ¬
"grep '<p class=\"secondary\">.*</b>' | " & ¬
"sed 's/<p class=\"secondary\"><b><a href=\"\\(.*\\)\">\\(.*\\)<\\/a><\\/b>/oreillynet.com\\1
\\2/'"
end myScript
end script
--==== Globals ====--
property Sites : {Macslash, Oreilly, Slashdot}
--==== Run Handler ====--
repeat with eachSite in Sites
tell eachSite
update()
prompt()
end tell
end repeat
I don’t know if this is the answer you want but I rejiggered your script (well, I rewrote it almost entirely) and created a headline HTML file to use as a browser (this saves the tedious dialog prompts). I also got rid of your script objects that may have contributed to your problems. Finally, I tweaked the grep statements and the link/headline parser which may also help you. Here is my version of the script that may give MacSurfer a run for their money:
Well, the thing is… I am not planning on using it quite like this. This is just the first part, to get the site/link/headline-and-URL hierarchy going; I was then going to use it to build the dock menu for an application. The way it was divided earlier provided greater readability, maintainability, and extensibility (OMG I am sounding like my comp-sci prof…). Anyway, is there any reason you can see for the crashing issues?
ps: another site (that needs a little debugging… )
script applescriptBBSOSX
property parent : Site
property name : "Applescript BBS: OS X forum"
on myScript()
return "curl bbs.applescript.net/viewforum.php?f=2 | " & ¬
"grep '<a href=\".*\" class=\"topicTitle\">.*</a></span>' | " & ¬
"sed 's/<a href=\"\\(.*\\)\" class=\"topicTitle\">\\(.*\\)<\\/a><\\/span>/\\1\\n\\2/'"
end myScript
end script
I’m not sure it is a memory issue. I was having problems with the hard returns in your grep strings; removing those and delimiting with a space instead seems to work fine for me. As to the other issues, I don’t see how yours is any more portable/maintainable/extendable. To add another site, just add another record to the the_sites property at the top of my script. The update(the_site) routine updates each site per your specifications. Instead of building HTML, just add a routine for building your dock menu from the the_sites records (this way you’d have a choice).
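For reference, here is a minimal sketch of the hard-return problem in plain shell. The sample line is made up for illustration (it is not real Slashdot output), and the function names are hypothetical. A raw line break inside a sed replacement generally has to be preceded by a backslash, otherwise sed rejects the command; the second function sidesteps the issue by putting URL and headline on one line with a tab between them.

```shell
# Hypothetical sample standing in for one line of the grep output.
sample='<LI><A HREF="//slashdot.org/article.pl?sid=1&mode=thread">Sample Headline</A>'

# URL on one line, headline on the next. Note the backslash directly
# before the line break inside the replacement text.
extract_two_lines() {
  sed 's/.*\/\/\(.*\)&mode.*>\(.*\)<\/A>/\1\
\2/'
}

# URL and headline on a single line, separated by a tab character.
extract_tabbed() {
  tab=$(printf '\t')
  sed "s/.*\/\/\(.*\)&mode.*>\(.*\)<\/A>/\1${tab}\2/"
}

printf '%s\n' "$sample" | extract_two_lines
printf '%s\n' "$sample" | extract_tabbed
```

With the tab form each link is one line of output, so the calling script can split on tabs rather than walking the list two items at a time.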
Incidentally, I know you can build dynamic dock menu items in AS Studio (I assume you’re using AS Studio to develop this), but currently Apple claims there is no method for connecting the new menus to a script handler (this may change in Xcode; I don’t have a copy). If you get that working I’d love to see how you did it.
I won’t pretend to grok these scripts and this is only offered as a FYI in case you aren’t aware of what’s available…
Many, if not all, of the sites in the scripts have XML feeds which might be easier to deal with. See that little orange XML icon at the bottom of this page? That’s the feed for this entire BBS. MacScripter.net has feeds for most of the other sections as well.
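To illustrate why a feed can be easier to work with than scraped HTML, here is a rough sketch in shell. The feed text is invented for the example, the function name feed_to_pairs is hypothetical, and it assumes each title and link element sits on its own line inside an item block; a real feed may need a proper XML parser.

```shell
# Hypothetical feed snippet; real feeds will differ in layout.
feed='<rss><channel>
<title>Example Feed</title>
<item>
<title>First headline</title>
<link>http://example.com/1</link>
</item>
<item>
<title>Second headline</title>
<link>http://example.com/2</link>
</item>
</channel></rss>'

# Print one "link<TAB>title" line per <item>, ignoring the channel title.
feed_to_pairs() {
  awk '
    /<item>/ { in_item = 1; title = ""; link = "" }
    in_item && /<title>/ { t = $0; sub(/.*<title>/, "", t); sub(/<\/title>.*/, "", t); title = t }
    in_item && /<link>/  { l = $0; sub(/.*<link>/, "", l); sub(/<\/link>.*/, "", l); link = l }
    /<\/item>/ { if (in_item) printf "%s\t%s\n", link, title; in_item = 0 }
  '
}

printf '%s\n' "$feed" | feed_to_pairs
```

The element names stay the same from feed to feed, so one routine could cover every site instead of a hand-tuned grep and sed pair per site.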
With regards to setting up the dynamic dock menu (I haven’t implemented it yet… but) here’s how I’m planning on implementing it: using the obj-c/AS bridge I will make an object that has a method like [dockMenuController doMenu:sender], and a method like [dockMenuController doMenuSelector] which returns @selector(doMenu:) or some such thing. When the dynamic menu sets up I can add a menu item with the selector for doMenu. The doMenu command gets a sender object reference passed to it by the Cocoa framework when the menu is called, and uses NSMenuItem’s index and menu methods to figure out which link it represents. It can then call the AppleScript to open the proper link. I don’t foresee any major problems, but it does require more Cocoa and obj-c use than the average ASStudio project. The key is that AppleScript can make obj-c calls and obj-c can execute AppleScripts.
Rob’s right that the RSS would be a better bet but here’s a grep string that seems to work for MacScripter:
I know you can call obj-C from AS (and vice versa) but it always confuses me. I don’t understand obj-C well enough to create my own methods. Again, if you get that part working, I’d love to see it.
I’m not sure if this applies here, but there is an interesting thread over at Mac OS X Hints about retrieving news feeds from some sites via scripts. Read through all the comments; there is a mention of the possibility of getting banned due to a bandwidth issue…
I have cleaned up the code a little more… I tried re-writing the writeHtml handler too but something strange was going on… however, to keep things up to date, here’s the latest revision
--====Hexley Headlines====--
-- File Layout
-- + Globals
-- + Main handler
-- + Utility Functions
--====Globals====--
property the_sites : {¬
{site_name:"Macslash", shell_script:"curl macslash.org/search.pl | grep 'article.pl' | sed 's/.*\\/\\/\\(.*\\)&.*>\\(.*\\)<\\/A>.*/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
{site_name:"O'Reilly", shell_script:"curl oreillynet.com/pub/q/all_mac_articles | grep '<p class=\"secondary\">.*</b>' | sed 's/<p class=\"secondary\"><b><a href=\"\\(.*\\)\">\\(.*\\)<\\/a><\\/b>/oreillynet.com\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
{site_name:"Slashdot", shell_script:"curl slashdot.org/search.pl | grep 'article.pl' | sed 's/.*\\/\\/\\(.*\\)&mode.*>\\(.*\\)<\\/A>/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
{site_name:"Applescript BBS: OS X Forum", shell_script:"curl \"bbs.applescript.net/viewforum.php?f=2\" 2>/dev/null|grep '<a href=\".*\" class=\"topictitle\">.*</a></span>'|sed 's/.*<a href=\"\\(.*\\)\" class=\"topictitle\">\\(.*\\)<\\/a><\\/span>/bbs.applescript.net\\/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
{site_name:"Applescript BBS: ASStudio", shell_script:"curl \"bbs.applescript.net/viewforum.php?f=3\" 2>/dev/null|grep '<a href=\".*\" class=\"topictitle\">.*</a></span>'|sed 's/.*<a href=\"\\(.*\\)\" class=\"topictitle\">\\(.*\\)<\\/a><\\/span>/bbs.applescript.net\\/\\1" & return & "\\2/'", new_links:{}, old_links:{}} ¬
}
--====Main Handler====--
on run
repeat with site in the_sites
set site to update(site)
end repeat
set file_path to (((path to temporary items) as string) & "headlines.html")
my write_to_file(file_path, generate_html(), false)
set file_path to ("file://" & (POSIX path of file_path))
tell application "Safari"
activate
open location file_path
end tell
end run
--====Utility Functions====--
on update(the_site)
set the_site's old_links to the_site's new_links
set cur_links_raw to every paragraph of (do shell script (the_site's shell_script))
set cur_links_parsed to {}
repeat with index from 1 to (count of cur_links_raw) by 2
set the_link to item index of cur_links_raw
set the_headline to item (index + 1) of cur_links_raw
set link_record to {the_link:the_link, the_headline:the_headline}
copy link_record to end of cur_links_parsed
end repeat
set the_site's new_links to cur_links_parsed
return the_site
end update
on generate_html()
set html_string to {"<html><head><title>Headlines</title></head><body>" & return & "<h3><i>Last updated " & ((current date) as string) & "</i></h3>" & return}
repeat with index from 1 to (count of the_sites)
set the_site to (item index of the_sites)
set new_links to the_site's new_links
if new_links = the_site's old_links then
set update_string to "<i>No new headlines.</i>" & return
else
set update_string to "<i>New Headlines!</i>"
end if
copy ("<h2>Headlines from " & (the_site's site_name) & ":</h2>" & return & update_string & "<ul>" & return) to end of html_string
repeat with k from 1 to (count of new_links)
set the_link to (the_link of (item k of new_links))
if (count of the_site's old_links) is greater than 0 then
-- old_links holds records, so compare against the stored link rather than the whole record
if the_link = the_link of the first item of the_site's old_links then exit repeat
end if
if the_link does not start with "http://" then set the_link to "http://" & the_link
copy (tab & "<li><a href=\"" & the_link & "\" target=\"_blank\">" & (the_headline of (item k of new_links)) & "</a></li>" & return) to end of html_string
end repeat
copy ("</ul>" & return) to end of html_string
end repeat
copy "</body></html>" to end of html_string
return html_string as string
end generate_html
on write_to_file(the_file, the_string, appending)
set the_file to the_file as string
try
set write_file to open for access file the_file with write permission
if appending = false then set eof of write_file to 0
write the_string to write_file starting at eof
close access write_file
on error
try
close access write_file
end try
end try
end write_to_file
With regards to being banned from Slashdot, I think they are referring to someone with a cron job set to refresh every second… used wisely, this script shouldn’t face any problems.
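One way to keep refreshes polite is to skip the fetch unless enough time has passed since the last one. This is only a sketch in shell: the helper name should_fetch and the 900-second gap are made up for illustration, and the timestamps are passed in as plain epoch seconds.

```shell
# Hypothetical helper: succeeds (exit status 0) only when at least
# min_gap seconds have elapsed since the previous fetch.
should_fetch() {
  last_fetch=$1   # epoch seconds of the previous fetch
  now=$2          # current epoch seconds
  min_gap=$3      # minimum number of seconds between fetches
  [ $((now - last_fetch)) -ge "$min_gap" ]
}

# 600 seconds since the last fetch with a 900-second minimum: skip.
if should_fetch 1000 1600 900; then echo "fetch"; else echo "skip"; fi
# 1000 seconds since the last fetch: fetch again.
if should_fetch 1000 2000 900; then echo "fetch"; else echo "skip"; fi
```

The calling script would store the time of each successful fetch per site and pass it back in on the next run.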
You changed it to spaces because the hard returns were giving you problems… using separate lines is more elegant (IMHO), but I forgot about the return literal and was typing \r and having SE turn it into a hard return automatically…
You’re right that the space was not an elegant delimiter so now I’ve changed it to a tab and it is much cleaner. I worked it up to a full app that also is smarter about caching HTML if the links haven’t changed. I’ve given this thread (and you, mooresan, specifically) credit. It is editable freeware. For more information and to download it, see: