Memory problems (I think)

This is my first really large AppleScript, and as I add more sites it has started crashing after the first few… Have I reached a memory boundary, or is there a more basic problem I am missing?


--==== Hexley's Headlines ====--
-- prompts me with my news and brings up the articles

--==== Hexley's Headlines organization ====--
-- Classes
-- +Link
-- +Site
-- Singletons
-- +based on site:
-- ++Slashdot
-- Globals
-- run handler

--==== Classes ====--

(* Link *)

script Link
	
	property myHeadline : missing value
	property myURL : missing value
	
	on openPage()
		-- "characters 1 thru 7" yields a list, which never equals a string,
		-- so test the prefix directly instead
		if myURL does not start with "http://" then
			set myURL to "http://" & myURL
		end if
		open location myURL
	end openPage
	
end script

(* Site *)

script Site
	
	property cur_links : {}
	property old_links : {}
	property new_links : {}
	
	on update()
		copy new_links to old_links
		set new_links to {}
		set cur_links to {}
		getNewData()
		set new_links to result
		repeat with eachLink in new_links
			if (count of old_links) > 0 then
				if myHeadline of eachLink = myHeadline of the first item of old_links then
					exit repeat
				else
					copy eachLink to the end of cur_links
				end if
			else
				copy eachLink to the end of cur_links
			end if
		end repeat
	end update
	
	on getNewData()
		set endResult to {}
		do shell script myScript()
		set listing to the result
		set myLines to every paragraph of listing
		set myCount to count myLines
		repeat with index from 1 to myCount by 2
			copy Link to newLink
			set myURL of newLink to item index of myLines
			set myHeadline of newLink to item (index + 1) of myLines
			copy newLink to end of endResult
		end repeat
		return endResult
	end getNewData
	
	on prompt()
		repeat with eachLink in cur_links
			set thePrompt to my name & ¬
				": " & myHeadline of eachLink
			set button to first item of ((display dialog thePrompt buttons {"Open Link", "Skip Link", "Skip Site"}) as list)
			
			if button = "Open Link" then
				tell eachLink to openPage()
			else if button = "Skip Site" then
				exit repeat
			end if
		end repeat
	end prompt
	
end script

--==== Singletons ====--

script Macslash
	property parent : Site
	property name : "Macslash"
	
	on myScript()
		return "curl macslash.org/search.pl | " & ¬
			"grep 'article.pl' | " & ¬
			"sed 's/.*\\/\\/\\(.*\\)&amp.*>\\(.*\\)<\\/A>.*/\\1" & return & "\\2/'"
	end myScript
	
end script

script Slashdot
	property parent : Site
	property name : "Slashdot"
	on myScript()
		return "curl slashdot.org/search.pl " & ¬
			"| grep 'article.pl' " & ¬
			"| sed 's/.*\\/\\/\\(.*\\)&mode.*>\\(.*\\)<\\/A>/\\1" & return & "\\2/' "
	end myScript
	
end script

script Oreilly
	property parent : Site
	property name : "O'Reilly"
	
	on myScript()
		return "curl oreillynet.com/pub/q/all_mac_articles | " & ¬
			"grep '<p class=\"secondary\">.*</b>' | " & ¬
			"sed 's/<p class=\"secondary\"><b><a href=\"\\(.*\\)\">\\(.*\\)<\\/a><\\/b>/oreillynet.com\\1" & return & "\\2/'"
	end myScript
	
end script


--==== Globals ====--

property Sites : {Macslash, Oreilly, Slashdot}

--==== Run Handler ====--

repeat with eachSite in Sites
	tell eachSite
		update()
		prompt()
	end tell
end repeat

thanks

I don’t know if this is the answer you want, but I rejiggered your script (well, I rewrote it almost entirely) and created a headline HTML file to view in a browser (this saves the tedious dialog prompts). I also got rid of your script objects, which may have contributed to your problems. Finally, I tweaked the grep statements and the link/headline parser, which may also help you. Here is my version of the script that may give MacSurfer a run for their money:

Jon

Well, the thing is… I am not planning on using it quite like this. This is the first part, just to get the site/link/headline,url hierarchy going, and then I was going to use it to make the dock menu for an application. The way it was divided earlier provided greater readability, maintainability, and extensibility (OMG I am sounding like my comp-sci prof…). Anyway, is there any reason you can see for the crashing issues?

ps: another site (that needs a little debugging… :wink: )


script applescriptBBSOSX
	property parent : Site
	property name : "Applescript BBS: OS X forum"
	
	on myScript()
		return "curl bbs.applescript.net/viewforum.php?f=2 | " & ¬
			"grep '<a href=\".*\" class=\"topicTitle\">.*</a></span>' | " & ¬
			"sed 's/<a href=\"\\(.*\\)\" class=\"topicTitle\">\\(.*\\)<\\/a><\\/span>/\\1\\n\\2/'"
	end myScript
	
end script

I’m not sure it is a memory issue. I was having problems with the hard returns in your grep strings; removing those and then delimiting with a space seems to work fine for me. As to the other issues, I don’t see how yours is any more portable/maintainable/extendable. To add another site, just add another record to the the_sites property at the top of my script. The update(the_site) routine updates each site per your specifications. Instead of building HTML, just add a routine for building your dock menu from the the_sites records (this way you’d have a choice).
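
For anyone following along, here is a minimal shell sketch of the single-line idea: sed prints the URL and the headline on one line separated by a space, and `read` splits at the first space. The sample HTML, the example.org URLs, and the `parse_links` name are made up for illustration; the real pages will need their own patterns.

```shell
# Hypothetical sample of the HTML lines that grep would pass along.
sample='<li><A HREF="//example.org/article.pl?sid=1&mode=thread">First headline</A>
<li><A HREF="//example.org/article.pl?sid=2&mode=thread">Second headline</A>'

# Emit "URL headline" on a single line. The URL contains no spaces,
# so the first space is a safe delimiter.
parse_links() {
    sed -n 's/.*\/\/\([^"&]*\)&[^>]*>\(.*\)<\/A>.*/\1 \2/p'
}

# read puts the first word in url and the rest of the line in headline.
printf '%s\n' "$sample" | parse_links | while read -r url headline; do
    printf 'url=%s headline=%s\n' "$url" "$headline"
done
```

This only works while the URLs contain no spaces; a headline may contain as many as it likes.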

Incidentally, I know you can build dynamic dock menu items in AS Studio (I assume you’re using AS Studio to develop this), but currently Apple claims there is no method for connecting the new menus to a script handler (this may change in Xcode; I don’t have a copy). If you get that working I’d love to see how you did it.

Jon

I won’t pretend to grok these scripts, and this is only offered as an FYI in case you aren’t aware of what’s available…

Many, if not all, of the sites in the scripts have XML feeds which might be easier to deal with. See that little orange XML icon at the bottom of this page? That’s the feed for this entire BBS. MacScripter.net has feeds for most of the other sections as well.

MacSlash: http://www.macslash.org/macslash.rdf
Slashdot: http://slashdot.org/slashdot.rss
O’Reilly’s MacDevCenter: http://www.oreillynet.com/meerkat/?&p=8113&_fl=rss

Late Night Software offers the XML Tools scripting addition to help parse and generate XML files.
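
As a rough illustration of why a feed is easier to scrape than a search page, the sketch below pulls link and title pairs out of an RSS-style fragment with awk. The feed text and the `rss_items` name are invented for the example, and it assumes each tag sits on its own line, which real feeds do not guarantee; a proper XML parser such as the XML Tools addition mentioned above would be more robust.

```shell
# Hypothetical RSS fragment; a real feed would come from curl.
feed='<rss><channel>
<title>Example Site</title>
<item>
<title>First story</title>
<link>http://example.org/1</link>
</item>
<item>
<title>Second story</title>
<link>http://example.org/2</link>
</item>
</channel></rss>'

# Print "link<TAB>title" for each item, assuming one tag per line.
# Titles and links outside an <item> (such as the channel title) are skipped.
rss_items() {
    awk '
        /<item[ >]/          { in_item = 1; title = ""; link = "" }
        in_item && /<title>/ { gsub(/.*<title>|<\/title>.*/, ""); title = $0 }
        in_item && /<link>/  { gsub(/.*<link>|<\/link>.*/, ""); link = $0 }
        /<\/item>/           { if (in_item) print link "\t" title; in_item = 0 }
    '
}

printf '%s\n' "$feed" | rss_items
```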

– Rob

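"curl \"bbs.applescript.net/viewforum.php?f=2\" 2>/dev/null|grep '<a href=\".*\" class=\"topictitle\">.*</a></span>'|sed 's/.*<a href=\"\\(.*\\)\" class=\"topictitle\">\\(.*\\)<\\/a><\\/span>/bbs.applescript.net\\/\\1 \\2/'"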

just in case you were trying it out…

With regard to setting up the dynamic dock menu (I haven’t implemented it yet… but), here’s how I’m planning to do it: using the obj-c/AS bridge I will make an object that has a method like [dockMenuController doMenu:sender], and a method like [dockMenuController doMenuSelector] which returns @selector(doMenu:) or some such thing. When the dynamic menu is set up I can add a menu item with the selector for doMenu. The doMenu method gets a sender object reference passed to it by the Cocoa framework when the menu is called, and uses NSMenuItem’s index and menu methods to figure out which link it represents. It can then call the AppleScript to open the proper link. I don’t foresee any major problems, but it does require more Cocoa and obj-c use than the average ASStudio project. The key is that AppleScript can make obj-c calls and obj-c can execute AppleScripts.

Rob’s right that the RSS would be a better bet but here’s a grep string that seems to work for MacScripter:


I know you can call obj-C from AS (and vice versa) but it always confuses me. I don’t understand obj-C well enough to create my own methods. Again, if you get that part working, I’d love to see it.

Jon

I’m not sure if this applies here, but there is an interesting thread over at Mac OS X Hints about retrieving news feeds from some sites via scripts. Read through all the comments; there is a mention of the possibility of getting banned due to a bandwidth issue…

http://www.macosxhints.com/article.php?story=20030425000830370&query=news

Also see…

http://slashdot.org/faq/accounts.shtml#ac1050

Tweaking it a bit more, I think this is what you want:


Jon

I have cleaned up the code a little more… I tried re-writing the writeHtml handler too but something strange was going on… however, to keep things up to date, here’s the latest revision


--====Hexley Headlines====--

-- File Layout
-- + Globals
-- + Main handler
-- + Utility Functions

--====Globals====--

property the_sites : {¬
	{site_name:"Macslash", shell_script:"curl macslash.org/search.pl | grep 'article.pl' | sed 's/.*\\/\\/\\(.*\\)&.*>\\(.*\\)<\\/A>.*/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
	{site_name:"O'Reilly", shell_script:"curl oreillynet.com/pub/q/all_mac_articles | grep '<p class=\"secondary\">.*</b>' | sed 's/<p class=\"secondary\"><b><a href=\"\\(.*\\)\">\\(.*\\)<\\/a><\\/b>/oreillynet.com\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
	{site_name:"Slashdot", shell_script:"curl slashdot.org/search.pl | grep 'article.pl' | sed 's/.*\\/\\/\\(.*\\)&mode.*>\\(.*\\)<\\/A>/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
	{site_name:"Applescript BBS: OS X Forum", shell_script:"curl \"bbs.applescript.net/viewforum.php?f=2\" 2>/dev/null|grep '<a href=\".*\" class=\"topictitle\">.*</a></span>'|sed 's/.*<a href=\"\\(.*\\)\" class=\"topictitle\">\\(.*\\)<\\/a><\\/span>/bbs.applescript.net\\/\\1" & return & "\\2/'", new_links:{}, old_links:{}}, ¬
	{site_name:"Applescript BBS: ASStudio", shell_script:"curl \"bbs.applescript.net/viewforum.php?f=3\" 2>/dev/null|grep '<a href=\".*\" class=\"topictitle\">.*</a></span>'|sed 's/.*<a href=\"\\(.*\\)\" class=\"topictitle\">\\(.*\\)<\\/a><\\/span>/bbs.applescript.net\\/\\1" & return & "\\2/'", new_links:{}, old_links:{}} ¬
		}

--====Main Handler====--

on run
	repeat with site in the_sites
		set site to update(site)
	end repeat
	
	set file_path to (((path to temporary items) as string) & "headlines.html")
	my write_to_file(file_path, generate_html(), false)
	set file_path to ("file://" & (POSIX path of file_path))
	tell application "Safari"
		activate
		open location file_path
	end tell
end run

--====Utility Functions====--

on update(the_site)
	set the_site's old_links to the_site's new_links
	set cur_links_raw to every paragraph of (do shell script (the_site's shell_script))
	set cur_links_parsed to {}
	repeat with index from 1 to (count of cur_links_raw) by 2
		set the_link to item index of cur_links_raw
		set the_headline to item (index + 1) of cur_links_raw
		set link_record to {the_link:the_link, the_headline:the_headline}
		copy link_record to end of cur_links_parsed
	end repeat
	set the_site's new_links to cur_links_parsed
	return the_site
end update

on generate_html()
	set html_string to {"<html><head><title>Headlines</title></head><body>" & return & "<h3><i>Last updated " & ((current date) as string) & "</i></h3>" & return}
	repeat with index from 1 to (count of the_sites)
		set the_site to (item index of the_sites)
		set new_links to the_site's new_links
		if new_links = the_site's old_links then
			set update_string to "<i>No new headlines.</i>" & return
		else
			set update_string to "<i>New Headlines!</i>"
		end if
		copy ("<h2>Headlines from " & (the_site's site_name) & ":</h2>" & return & update_string & "<ul>" & return) to end of html_string
		repeat with k from 1 to (count of new_links)
			set the_link to (the_link of (item k of new_links))
			if the (count of the_site's old_links) is greater than 0 then ¬
				if the_link = (the_link of (first item of the_site's old_links)) then ¬
					exit repeat
			if the_link does not start with "http://" then set the_link to "http://" & the_link
			copy (tab & "<li><a href=\"" & the_link & "\" target=\"_blank\">" & (the_headline of (item k of new_links)) & "</a></li>" & return) to end of html_string
		end repeat
		copy ("</ul>" & return) to end of html_string
	end repeat
	copy "</body></html>" to end of html_string
	return html_string as string
end generate_html

on write_to_file(the_file, the_string, appending)
	set the_file to the_file as string
	try
		set write_file to open for access file the_file with write permission
		if appending = false then set eof of write_file to 0
		write the_string to write_file starting at eof
		close access write_file
	on error
		try
			close access write_file
		end try
	end try
end write_to_file

With regard to being banned from Slashdot, I think they are referring to someone with a cron job that refreshes every second… Used wisely, this script shouldn’t face any problems.
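
One way to stay polite is to cache each page and only hit the site again once the cached copy is older than some threshold. This is just a sketch: `cached_fetch`, the `FETCH` override, and the 30 minute figure are assumptions of mine, not anything the sites document, and `find -mmin` is an extension rather than POSIX, though both GNU and BSD find support it.

```shell
# Fetch a URL at most once per max_age_min minutes, reusing a cached copy otherwise.
# FETCH is a hypothetical override so the sketch can be tried without network access;
# by default the page is fetched with "curl -s".
cached_fetch() {
    url=$1
    cache=$2
    max_age_min=$3
    # Refetch when the cache is missing or empty, or when find reports
    # that it was last modified more than max_age_min minutes ago.
    if [ ! -s "$cache" ] || [ -n "$(find "$cache" -mmin +"$max_age_min" 2>/dev/null)" ]; then
        ${FETCH:-curl -s} "$url" > "$cache.tmp" && mv "$cache.tmp" "$cache"
    fi
    cat "$cache"
}

# Example call (not run here):
#   cached_fetch "slashdot.org/search.pl" "/tmp/slashdot.html" 30
```

Writing to a temporary file first means a failed download does not wipe out the last good copy.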

You changed it to spaces because the hard returns were giving you problems… Using separate lines is more elegant (IMHO), but I forgot about the return literal and was typing \r, which Script Editor automatically turns into a hard return…
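
For comparison, when the same kind of sed command is typed straight into a shell rather than built as an AppleScript string, the portable way to split one line into two is a backslash followed by a real line break in the replacement. A small sketch with a made-up sample line and a hypothetical `split_link` helper:

```shell
# Hypothetical sample line standing in for grep's output.
line='<A HREF="//example.org/article.pl?sid=1&mode=thread">First headline</A>'

# Put the URL on one line and the headline on the next.
# The backslash at the end of the first sed line escapes the newline,
# so sed inserts a line break between the two captured groups.
split_link() {
    sed 's/.*\/\/\([^"&]*\)&[^>]*>\(.*\)<\/A>.*/\1\
\2/'
}

printf '%s\n' "$line" | split_link
```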

You’re right that the space was not an elegant delimiter so now I’ve changed it to a tab and it is much cleaner. I worked it up to a full app that also is smarter about caching HTML if the links haven’t changed. I’ve given this thread (and you, mooresan, specifically) credit. It is editable freeware. For more information and to download it, see:

http://homepage.mac.com/jonn8/as/#Jons_Headlines

Jon

I count it as a team effort… thanks for your contributions as well…

I will give you some credit in my ASStudio extension that I mentioned…

Beaten to the punch?

http://sourceforge.net/projects/iconaradom/

Jon