Help! I can't find the bug!

Hi, y’all!

I’m not exactly “new” to AppleScript, but I’ve only recently started trying to make it jump through hoops. Frankly, I never had a need for it until this project. I’ve done some work with HyperCard and SuperCard in the past, but I’m really not that great a programmer. I found this forum via Google and am hoping y’all can help. I’m stumped!

I’m trying to put together a script that will automatically pull obituary, wedding, and engagement announcements from an archive and post them on a website. The script would seem to be pretty straightforward, but I’m having trouble with it. There’s a bug in it that causes the announcements to be copied in repeatedly. I can’t figure out why. I have this terrible feeling that my algorithm is flawed. So far I’ve only coded for the obits – since that’s not working right, I see no reason to get the other announcements working yet. The archive files are expected to be either Quark 4.1 files or plain text.

The process SHOULD be this: Find the correct archive folders on the server, get the announcements using a filename flag, open the announcement to extract the text, format the text into a HTML file, then upload that file to the server.

One problem may be platform related. The files (and script) are created under OS 9, but our server is running OS X. It doesn’t seem to matter whether I run the script on 9 or X, though; the same problem occurs either way.

You can see the output of the script at http://www.heraldindependent.com/html/obits.html.

Here’s the actual script. For security, I’ve pulled out the part that sends the files to the server. I mostly want to fix the bug, but if anybody sees another improvement please let me know. The whole thing isn’t appearing when I preview this message, so I hope y’all get the full script.


-- This script gets obits from the archive files and pastes them into the appropriate master web page in HTML format. ******

-- First initialize some variables
set theDate to the current date -- this is the current date
set theText to ""

-- *********** This section of the code establishes file paths and gets the header and footer data *******************

-- Set up various variables
set folderPath to "WinVault:News Copy: CURRENT ISSUE PAGES:PAGES PRINTED" -- This sets the directory for the pages.
set workFilePath to "Data:The Herald Independent:" -- File path for local work files
set theFolders to recentPaperDates() -- Gets a list of folders to look in for announcements
set theTopFile to workFilePath & "stdtop.txt" -- Page header
set theBotFile to workFilePath & "stdbot.txt" -- Page footer
set theObitFile to workFilePath & "obits.html" -- Obits file
set theEngFile to workFilePath & "engagements.html" -- Engagements file
set theWedFile to workFilePath & "weddings.html" -- Weddings file

-- Get the file text into variables
open for access alias theTopFile --Open the top page template
set theTopText to (read alias theTopFile as text from 1 to (get eof alias theTopFile)) -- read the file into a variable
close access alias theTopFile -- Close the top page template

open for access alias theBotFile -- Open the bottom page template
set theBotText to (read alias theBotFile as text from 1 to (get eof alias theBotFile)) -- read the file into a variable
close access alias theBotFile -- close the bottom page template

-- ********** This section creates the obits file ********************************************************

set obitHeader to "<CENTER><H1>Obituaries</H1></CENTER><P><HR><P>" -- create Obits heading
set engHeader to "<CENTER><H1>Engagements</H1></CENTER><P><HR><P>" -- create Engagements heading
set wedHeader to "<CENTER><H1>Weddings</H1></CENTER><P><HR><P>" -- create Weddings heading

set theObits to obitHeader & getAnnouncements(theFolders, "obit", folderPath) -- assemble the obits into one variable
set theEng to engHeader & getAnnouncements(theFolders, "eng", folderPath) -- assemble the engagements into one variable
set theWed to wedHeader & getAnnouncements(theFolders, "wed", folderPath) -- assemble the weddings into one variable

copy theObits & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theObitCode -- Add a "last updated" date stamp to the obits
copy theEng & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theEngCode -- Add a "last updated" date stamp to the engagements
copy theWed & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theWedCode -- Add a "last updated" date stamp to the weddings

set theNewCode to theTopText & return & theObitCode & return & theBotText -- Assemble the three pieces into a new variable

tell application "Finder"
	if (exists file theObitFile) then delete file theObitFile -- Erase the old file
end tell

--Write the new code to the obit file
tell application "Finder" to make file at workFilePath with properties {name:"obits.html", creator type:"ttxt", file type:"TEXT"}
open for access alias theObitFile with write permission -- create the file
write theNewCode to alias theObitFile -- write to the file
close access alias theObitFile -- close the file

-- Now send the obits file to the Web server. Deleted for security reasons.

-- The above section will need to be rewritten to update the weddings and engagements, too
-- Classifieds use a different file structure so they'll need their own script.

-- ********** This section of the script defines our function handlers ****************************************

on getAnnouncements(theFolders, announceKey, folderPath)
	
	-- This function extracts the text from Quark and text files and assembles it into one big text chunk.
	-- Parameters:
	-- theFolders       - A list of the folders containing the announcements we want to run
	-- announceKey  - The keyword in the filename for the announcement type
	-- folderPath       - The path to the files
	
	-- Initialize some variables
	set returnValue to "" -- the return variable
	set theText to "" -- holds the announcements
	set theCode to "" -- holds the announcements after they've been formatted in HTML
	
	repeat with thePaper from 1 to (the number of words in theFolders) -- step through the paper dates
		
		-- Drill down to the " Gone Copy" folder. This expects a specific file structure and will fail if the structure is changed.
		copy folderPath & ":" & PaperYear(word thePaper of theFolders) & ":" & (word thePaper of theFolders) & ": gone copy" to announceFolderPath
		
		-- put the names of all the files in the "gone copy" folder into a variable
		copy (list folder announceFolderPath) to fileList
		
		repeat with theItem from 1 to (the number of items in fileList) --this repeat checks to see if we've got an announcement of the correct type
			
			set currentFile to announceFolderPath & ":" & (item theItem of fileList) -- assemble the file path
			
			if currentFile contains announceKey then -- yep, it's an announcement of the correct type
				
				tell application "Finder"
					set fileType to (file type of file currentFile) -- get the file type
				end tell
				
				if fileType is "XDOC" then -- it's a Quark file
					
					tell application "QuarkXPress™" -- this tell will extract the text of the obit
						activate
						open file (announceFolderPath & ":" & (item theItem of fileList)) -- open this particular announcement
						
						try -- it probably doesn't need to be this complicated but I copied this block from someone else's script
							repeat with theStory from 1 to count of stories in document 1
								if exists word 1 of story theStory of document 1 then
									copy the text of story theStory of document 1 to theText -- dump the text into a variable
								end if
							end repeat
						on error errMsg number errnum
							-- need to beware of problems here (disk full, ???)
							display dialog (errMsg & return & return & errnum) buttons {"Oops!"} default button 1 with icon 0
							return
						end try
						
						close document 1 -- close the obit file
					end tell -- end of getting a Quark file
					
				else -- it's not a Quark file, so...
					if fileType is "TEXT" then -- it's a text file
						open for access file currentFile -- open the file
						set theText to (read file currentFile as text from 1 to (get eof file currentFile))
						close access file currentFile
					else -- Error! It's neither Quark or text!
						set theText to ""
					end if --end of getting a text file
				end if -- end of the file type check
				
				-- Now format the text in HTML in a new variable theCode, which only means adding "<P>" tags at the graph breaks
				repeat with theGraph from 1 to (the number of paragraphs in theText)
					copy theCode & (paragraph theGraph of theText) & "<P>" & return to theCode
				end repeat
				
				copy (theCode & return & "<HR>" & return) to theCode -- get all the announcements together with a standard separator between them
				
			end if -- end of the check for the announcement keyword check
			
		end repeat -- end of stepping through the file list
		
		copy returnValue & theCode to returnValue -- Now put the announcement at the end of any previous announcements
		
	end repeat -- end of stepping through the paper dates
	
	return returnValue -- sends the announcement text chunk
	
end getAnnouncements

on recentPaperDates()
	-- this handler figures out the dates of the last two weeks of papers
	-- and returns them as a comma-separated string.
	
	-- initialize some empty global variables
	global gPaperList
	copy "" to gPaperList
	
	-- Get the date range
	global gTodaysDate
	set gTodaysDate to the current date -- today
	global gStartDate
	set gStartDate to (the (current date) - (14 * days)) -- subtract two weeks
	
	-- Extract the paper dates
	set gTheDate to gStartDate -- initialize our counter
	
	repeat while gTheDate ≠ gTodaysDate
		if the weekday of gTheDate is Tuesday then -- it's Tuesday
			set gPaperList to (gPaperList & numberDate(gTheDate) & ", ") -- add this date to the paper list
		else
			if the weekday of gTheDate is Friday then -- it's Friday
				set gPaperList to (gPaperList & numberDate(gTheDate) & ", ") --add this date to the paper list
			end if
		end if
		
		set gTheDate to (gTheDate + (1 * days)) -- increment the counter
	end repeat
	
	return gPaperList
	
end recentPaperDates

on numberDate(theDate)
	-- this little function converts the date to the six-digit number used in the HI archive
	
	--get the month into numerical format (@#$% applescript!)
	if the month of theDate is January then set theMonth to "01" as string
	if the month of theDate is February then set theMonth to "02" as string
	if the month of theDate is March then set theMonth to "03" as string
	if the month of theDate is April then set theMonth to "04" as string
	if the month of theDate is May then set theMonth to "05" as string
	if the month of theDate is June then set theMonth to "06 as string"
	if the month of theDate is July then set theMonth to "07" as string
	if the month of theDate is August then set theMonth to "08" as string
	if the month of theDate is September then set theMonth to "09" as string
	if the month of theDate is October then set theMonth to "10" as string
	if the month of theDate is November then set theMonth to "11" as string
	if the month of theDate is December then set theMonth to "12" as string
	
	set theDay to (the day of theDate as string) -- get the day into a two-digit number as a string
	if the number of characters in theDay < 2 then set theDay to "0" & theDay
	
	set theyear to (characters 3 thru 4 of (the year of theDate as string)) -- extract the year
	
	set theReturnValue to theMonth & theDay & theyear
	return theReturnValue
	
end numberDate

on PaperYear(theFolderName)
	-- this function extracts the year from the six-digit folder name
	
	set theyear to "20" & (characters 5 thru 6 of theFolderName)
	
	return theyear
	
end PaperYear

on updateObitCode(theObitCode, theObitPage)
	-- this function assembles the code from the HTML file
	
	open for access theObitPage with write permission
	set theText to (read alias theObitPage as text from 1 to (get eof alias theObitPage)) as text
	
	-- Step through the lines of the file looking for the trigger lines
	repeat with theLine from 1 to (the words of theText)
		
		if line theLine of theText contains "StartScriptSource" then
			copy theLine to topCodeLineDiv -- identifies the bottom of the top
		end if
		
		if line theLine of theText contains "EndScriptSource" then
			copy theLine to botCodeLineDiv -- identifies the top of the bottom
		end if
		
	end repeat
	
	-- Now actually separate stuff into three variables
	repeat with theLine from 1 to (the words of theCode) -- actually do the separation
		
		if theLine ≤ topCodeLineDiv then -- we're still in the top part of the file
			copy theTopCode & (line theLine of theCode) & " " to theTopCode
		end if
		
		if (theLine > topCodeLineDiv) and (theLine < botCodeLineDiv) then
			-- we're in the part that will change
			next
		end if
		
		if theLine ≥ botCodeLineDiv then -- we're now in the bottom part of the file
			copy theBotCode & (line theLine of theCode) & " " to theBotCode
		end if
		
	end repeat
	
	-- Now reassemble the three variables into a single block of HTML code
	copy theTopCode & return & theObitCode & return & theBotCode to theNewCode
	
	write theNewCode to alias theObitPage as text
	close access alias theObitPage
	
	return "true"
	
end updateObitCode

Model: Power Mac G4
AppleScript: 1.8.3
Browser: Netscape/7.02
Operating System: Mac OS 9.2.x

The problem we face with your query is that none of us have AS 1.8.3 to test it on, and most late-model machines won’t support OS 9.2.2 either.

Hi, scrmcurator.

I can’t see what’s causing the repeated copies, but there’s quite a lot that’s either wrong or suboptimal in the script. To pick out some of the fatally wrong stuff:

There’s a typo in the numberDate() handler:

if the month of theDate is June then set theMonth to "06 as string"

-- Should be:
if the month of theDate is June then set theMonth to "06" as string

But the ‘as string’ isn’t actually necessary (“06” is already a string) and the handler itself could be reduced to:

on numberDate(thedate)
	set {day:d, year:y} to thedate
	copy thedate to b
	set b's month to January
	set m to (b - 2500000 - thedate) div -2500000
	
	return text 2 thru 7 of ((1000000 + m * 10000 + d * 100 + y mod 100) as string)
end numberDate

In the updateObitCode() handler, the first repeat is written as ‘repeat with theLine from 1 to (the words of theText)’.

The ‘words of theText’ is a list, not a number. To judge from what follows, I’d guess you probably want:

repeat with theLine from 1 to (count theText's paragraphs)

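Then, inside that repeat, the tests use ‘line theLine of theText’:

‘line’ isn’t usable in vanilla AppleScript. You probably want ‘paragraph’.

For the second repeat, ‘repeat with theLine from 1 to (the words of theCode)’, see above regarding the repeat. Also, the variable ‘theCode’ hasn’t actually been set at that point.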
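Putting those points together, here is a minimal sketch of how the marker search might look with ‘paragraph’ and a proper count. The handler name splitAtMarkers() and its return value are made up for illustration; only the two marker strings come from the original script, and this is not a drop-in replacement for updateObitCode().

```applescript
-- Hypothetical helper (not part of the original script):
-- returns {topCode, botCode}, i.e. the paragraphs up to and including the
-- "StartScriptSource" marker and the paragraphs from "EndScriptSource" onward.
on splitAtMarkers(theText)
	set topCode to "" -- initialized before use, unlike in the original handler
	set botCode to ""
	set topDiv to 0
	set botDiv to 0
	set paraCount to (count theText's paragraphs) -- a number, not a list
	
	-- Find the two marker paragraphs
	repeat with i from 1 to paraCount
		set thisPara to paragraph i of theText
		if thisPara contains "StartScriptSource" then set topDiv to i
		if thisPara contains "EndScriptSource" then set botDiv to i
	end repeat
	if topDiv is 0 or botDiv is 0 then error "Marker lines not found."
	
	-- Collect the parts of the page that stay the same
	repeat with i from 1 to paraCount
		if i ≤ topDiv then set topCode to topCode & (paragraph i of theText) & return
		if i ≥ botDiv then set botCode to botCode & (paragraph i of theText) & return
	end repeat
	
	return {topCode, botCode}
end splitAtMarkers
```

The new obit code would then go between the two returned pieces.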

I hope this helps you get started with the repairs.

I understand most of what you’re saying about my other mistakes, but the above code confuses me. I’m not the most competent programmer as my code pretty clearly shows. Can you explain how the code above works? Obviously it’s way more efficient than my function, but it’s more efficient to the point that this artist trying to be a scripter can’t follow it!

Specifically, the assignment statement with the curly braces is completely new to me. It looks like that’s setting the day and year part of the current date? The equation in the return is also confusing me – can you break down what that’s doing?

Hi. Apologies for the delayed reply. I’ve been away this weekend.

Apologies too, I suppose, for my unexplained code, which, although amazingly good, might appear abstruse even to experienced AppleScripters unless they’re familiar with the particular techniques. :)

AppleScript has two methods, which use either a list or a record, for setting several variables in one line. They’re a little inefficient in execution terms, but can sometimes tidy up the way the script looks on screen. They’re particularly nice if you already have a list or a record whose contents you want to assign to separate variables:

-- With a list:
set myList to {1, 2, 3}

-- Conventional longhand:
set a to item 1 of myList
set b to item 2 of myList
set c to item 3 of myList

-- Less efficient, but neater and less to type:
set {a, b, c} to myList
--> The 'result' of the line is {1, 2, 3}
--> but individually, a = 1, b = 2, c = 3.

If there are more variables to set than there are items in the source list, you get an error. But if there are more items than variables, the later ones are simply ignored.

set myList to {1, 2, 3}

set {a, b} to myList
--> The 'result' of the line is {1, 2, 3}
--> but individually, a = 1, b = 2.
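
The opposite case, with more variables than items, can be sketched like this. The variable names are only for illustration, and the sketch deliberately doesn't rely on the exact error text or number:

```applescript
set myList to {1, 2, 3}

try
	set {a, b, c, d} to myList -- four variables, but only three items
	set outcome to "no error"
on error errMsg number errNum
	-- We end up here because there is no item 4 to assign to d
	set outcome to "error " & errNum
end try
```

Wrapping the assignment in a ‘try’ block like this is one way to cope with a source list that may be shorter than expected.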

The method that foxed you involved a record. AppleScript’s ‘date’ class is related to its ‘record’ class. (Dates are sometimes called “date records” in these discussions.) Dates can’t be manipulated in all the same ways as records (nor vice versa), but their property values can be accessed and changed similarly.

-- Record:
set myRecord to {x:1, y:2, z:3}

-- Conventional longhand:
set a to x of myRecord
set b to y of myRecord
set c to z of myRecord

-- Slightly less efficient, but shorter:
set {x:a, y:b, z:c} to myRecord
--> The 'result' of the line is {x:1, y:2, z:3}
--> but individually, a = 1, b = 2, c = 3.


-- Date:
set myDate to (current date)

-- Conventional longhand:
set d to day of myDate
set m to month of myDate
set y to year of myDate

-- Slightly less efficient, but shorter:
set {day:d, month:m, year:y} to myDate
--> The 'result' of the line is date "Monday 1 May 2006 11:29:55" (or whatever)
--> but individually, d = 1, m = May, y = 2006.

That hopefully explains the ‘set {day:d, year:y} to thedate’ line. It sets ‘d’ to the day of ‘thedate’ and ‘y’ to the year of ‘thedate’.

The next three lines calculate the month number. Month-to-integer coercions weren’t introduced until Mac OS X 10.3 (Panther). Before that, the best way to get a month number from a date was a calculation like the one in those three lines.
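
As a stand-alone sketch, the same three lines from the numberDate() handler above can be wrapped in a small handler of their own. The name monthNumber() is made up for illustration and isn’t part of the original script:

```applescript
-- Hypothetical wrapper around the three month-number lines of numberDate()
on monthNumber(thedate)
	copy thedate to b -- a separate copy, so thedate itself isn't changed
	set b's month to January -- same year, day, and time, but in January
	-- (b - thedate) is minus the number of seconds from the start of the year
	-- to the start of thedate's month; the two negatives cancel out in the div
	set m to (b - 2500000 - thedate) div -2500000
	return m
end monthNumber
```

Calling monthNumber(current date) should give an integer from 1 to 12.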

This is a later development (by Richard Hartman and then by me) of Emmanuel Levy’s famous “French Vanilla” technique. The date ‘b’ has the same year, day, and time as ‘thedate’, but its month is January. (It doesn’t matter if thedate’s month is January too.) The difference between them is the same as the number of seconds from the beginning of the year to the beginning of thedate’s month. Adding a suitable number of seconds (in the range from 28 days to about 29.5 days) to this and then divving the result by the same number gives the number of thedate’s month. 2500000 is a conveniently round number in the range that works. The double negative in my implementation avoids a long-standing, but rarely encountered, AppleScript date bug that’s only recently been fixed.

The last line is an efficient way to return the mmddyy string you want. If m = 5, d = 1, and y = 2006, the maths produces the number 1050106, which is coerced to the string “1050106”, and then the substring from the second to seventh characters is extracted from this. The leading zeros appear incidentally during the process without the need for individual string coercions and concatenations.
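
The same arithmetic, spelled out one step at a time with the values from the paragraph above (the intermediate variable names are only for illustration):

```applescript
-- Worked example with m = 5, d = 1, y = 2006
set m to 5
set d to 1
set y to 2006

set n to 1000000 + m * 10000 + d * 100 + y mod 100 -- 1000000 + 50000 + 100 + 6
set s to n as string -- "1050106"
set mmddyy to text 2 thru 7 of s -- drops the leading "1", leaving "050106"
```

The leading 1000000 is only there to pad the number out to seven digits, so that a single-digit month still gets its leading zero.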

A very simple solution:

return do shell script "date +%Y-%m-%d"

(See “man date” in the terminal for different formattings.)

This delivers a date string in German format, just as an example:

return do shell script "date '+%d.%m.%Y - %H:%M:%S Uhr'"

You may define your own date string format that way! (OS X only)
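
For the six-digit mmddyy folder names discussed earlier in the thread, a sketch along the same lines might be the following. It only covers today’s date, and the variable name is made up for illustration:

```applescript
-- %m = two-digit month, %d = two-digit day, %y = two-digit year (OS X only)
set todaysFolderName to do shell script "date +%m%d%y"
```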

And very much slower. ;)

After a rather hectic month I’ve finally had a chance to get back to this project. There’s nothing like being away from something for a while to give you a fresh perspective! I’ve fixed a few problems but the basic repetition problem is still stumping me.

One thing I saw when I went back into the script is that the UpdateObitCode handler is a redundant leftover from a previous iteration of the script. I’ve deleted it and fixed the other things mentioned. Now only SOME of the obits are repeating instead of ALL of them. Here’s the output: http://www.heraldindependent.com/html/obits.html

A check of the Event Log shows that the obit files are only being read once, so the problem must be with the part of the script that makes them into HTML. The script output has some extra horizontal rules in it where it quits repeating. Could this be a clue to what’s wrong?

BTW, I’m now running in OS X 10.3.9 with AppleScript 1.9.3. However, the script has to drop to Classic (OS 9.2.2) to pull the Quark files since they’re generated in Quark 4.1 (I know, but I’m stuck with this for a while). Could going back and forth between OS 9 and OS X be causing the repetition?

Any other ideas will be appreciated!

-- This script gets obits from the archive files and pastes them into the appropriate master web page in HTML format. ******

-- First initialize some variables
set theDate to the current date -- this is the current date
set theText to ""

-- *********** This section of the code establishes file paths and gets the header and footer data *******************

-- Set up various variables
set folderPath to "WinVault:News Copy: CURRENT ISSUE PAGES:PAGES PRINTED" -- This sets the directory for the pages.
set workFilePath to "Data:The Herald Independent:" -- File path for local work files
set theFolders to recentPaperDates() -- Gets a list of folders to look in for announcements
set theTopFile to workFilePath & "stdtop.txt" -- Page header
set theBotFile to workFilePath & "stdbot.txt" -- Page footer
set theObitFile to workFilePath & "obits.html" -- Obits file
set theEngFile to workFilePath & "engagements.html" -- Engagements file
set theWedFile to workFilePath & "weddings.html" -- Weddings file

-- Get the file text into variables
open for access alias theTopFile --Open the top page template
set theTopText to (read alias theTopFile as text from 1 to (get eof alias theTopFile)) -- read the file into a variable
close access alias theTopFile -- Close the top page template

open for access alias theBotFile -- Open the bottom page template
set theBotText to (read alias theBotFile as text from 1 to (get eof alias theBotFile)) -- read the file into a variable
close access alias theBotFile -- close the bottom page template

-- ********** This section creates the announcements files ********************************************************

set obitHeader to "<CENTER><H1>Obituaries</H1></CENTER><P><HR><P>" -- create Obits heading
set engHeader to "<CENTER><H1>Engagements</H1></CENTER><P><HR><P>" -- create Engagements heading
set wedHeader to "<CENTER><H1>Weddings</H1></CENTER><P><HR><P>" -- create Weddings heading

set theObits to obitHeader & getAnnouncements(theFolders, "obit", folderPath) -- assemble the obits into one variable
set theEng to engHeader & getAnnouncements(theFolders, "eng", folderPath) -- assemble the engagements into one variable
set theWed to wedHeader & getAnnouncements(theFolders, "wed", folderPath) -- assemble the weddings into one variable

copy theObits & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theObitCode -- Add a "last updated" date stamp to the obits
copy theEng & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theEngCode -- Add a "last updated" date stamp to the engagements
copy theWed & return & "<hr>" & return & "Last Update: " & theDate & " - " & (current date) & "<P>" & return & return to theWedCode -- Add a "last updated" date stamp to the weddings

set theNewCode to theTopText & return & theObitCode & return & theBotText -- Assemble the three pieces into a new variable

tell application "Finder"
	if (exists file theObitFile) then delete file theObitFile -- Erase the old file
end tell

--Write the new code to the obit file
tell application "Finder" to make file at workFilePath with properties {name:"obits.html", creator type:"ttxt", file type:"TEXT"}
open for access alias theObitFile with write permission -- create the file
write theNewCode to alias theObitFile -- write to the file
close access alias theObitFile -- close the file

-- Now send the obits file to the Web server. This was recorded from Fetch.
tell application "Fetch 4.0.3"
	-- removed for security
end tell

-- The above section will need to be rewritten to update the weddings and engagements, too
-- Classifieds use a different file structure so they'll need their own script.

-- ********** This section of the script defines our function handlers ****************************************

on getAnnouncements(theFolders, announceKey, folderPath)
	
	-- This function extracts the text from Quark and text files and assembles it into one big text chunk.
	-- Parameters:
	-- theFolders       - A list of the folders containing the announcements we want to run
	-- announceKey  - The keyword in the filename for the announcement type
	-- folderPath       - The path to the files
	
	-- Initialize some variables
	set returnValue to "" -- the return variable
	set theText to "" -- holds the announcements
	set theCode to "" -- holds the announcements after they've been formatted in HTML
	
	repeat with thePaper from 1 to (the number of words in theFolders) -- step through the paper dates
		
		-- Drill down to the " Gone Copy" folder. This expects a specific file structure and will fail if the structure is changed.
		copy folderPath & ":" & PaperYear(word thePaper of theFolders) & ":" & (word thePaper of theFolders) & ": gone copy" to announceFolderPath
		
		-- put the names of all the files in the "gone copy" folder into a variable
		copy (list folder announceFolderPath) to fileList
		
		repeat with theItem from 1 to (the number of items in fileList) --this repeat checks to see if we've got an announcement of the correct type
			
			set currentFile to announceFolderPath & ":" & (item theItem of fileList) -- assemble the file path
			
			if currentFile contains announceKey then -- yep, it's an announcement of the correct type
				
				tell application "Finder"
					set fileType to (file type of file currentFile) -- get the file type
				end tell
				
				if fileType is "XDOC" then -- it's a Quark file
					
					tell application "QuarkXPress™" -- this tell will extract the text of the obit
						activate
						open file (announceFolderPath & ":" & (item theItem of fileList)) -- open this particular announcement
						
						try -- it probably doesn't need to be this complicated but I copied this block from someone else's script
							repeat with theStory from 1 to count of stories in document 1
								if exists word 1 of story theStory of document 1 then
									copy the text of story theStory of document 1 to theText -- dump the text into a variable
								end if
							end repeat
						on error errMsg number errnum
							-- need to beware of problems here (disk full, ???)
							display dialog (errMsg & return & return & errnum) buttons {"Oops!"} default button 1 with icon 0
							return
						end try
						
						close document 1 -- close the obit file
					end tell -- end of getting a Quark file
					
				else -- it's not a Quark file, so...
					if fileType is "TEXT" then -- it's a text file
						open for access file currentFile -- open the file
						set theText to (read file currentFile as text from 1 to (get eof file currentFile))
						close access file currentFile
					else -- Error! It's neither Quark or text!
						set theText to ""
					end if --end of getting a text file
				end if -- end of the file type check
				
				-- Now format the text in HTML in a new variable theCode, which only means adding "<P>" tags at the graph breaks
				repeat with theGraph from 1 to (the number of paragraphs in theText)
					copy theCode & (paragraph theGraph of theText) & "<P>" & return to theCode
				end repeat
				
				copy (theCode & return & "<HR>" & return) to theCode -- get all the announcements together with a standard separator between them
				
			end if -- end of the check for the announcement keyword check
			
		end repeat -- end of stepping through the file list
		
		copy returnValue & theCode to returnValue -- Now put the announcement at the end of any previous announcements
		
	end repeat -- end of stepping through the paper dates
	
	return returnValue -- sends the announcement text chunk
	
end getAnnouncements

on recentPaperDates()
	-- this handler figures out the dates of the last two weeks of papers
	-- and returns them as a comma-separated string.
	
	-- initialize some empty global variables
	global gPaperList
	copy "" to gPaperList
	
	-- Get the date range
	global gTodaysDate
	set gTodaysDate to the current date -- today
	global gStartDate
	set gStartDate to (the (current date) - (14 * days)) -- subtract two weeks
	
	-- Extract the paper dates
	set gTheDate to gStartDate -- initialize our counter
	
	repeat while gTheDate ≠ gTodaysDate
		if the weekday of gTheDate is Tuesday then -- it's Tuesday
			set gPaperList to (gPaperList & numberDate(gTheDate) & ", ") -- add this date to the paper list
		else
			if the weekday of gTheDate is Friday then -- it's Friday
				set gPaperList to (gPaperList & numberDate(gTheDate) & ", ") --add this date to the paper list
			end if
		end if
		
		set gTheDate to (gTheDate + (1 * days)) -- increment the counter
	end repeat
	
	return gPaperList
	
end recentPaperDates

on numberDate(theDate)
	-- This function converts the date to the six-digit format used in the HI archive
	set {day:d, year:y} to theDate
	copy theDate to b
	set b's month to January
	set m to (b - 2500000 - theDate) div -2500000
	
	return text 2 thru 7 of ((1000000 + m * 10000 + d * 100 + y mod 100) as string)
end numberDate

on PaperYear(theFolderName)
	-- this function extracts the year from the six-digit folder name
	
	set theyear to "20" & (characters 5 thru 6 of theFolderName)
	
	return theyear
	
end PaperYear

Model: Power Mac G4 500MHz
AppleScript: 1.9.3
Browser: Firefox 1.5.0.3
Operating System: Mac OS X (10.3.9)

Hi, Matt.

I think the cause of the repeated text is that you’re not resetting theCode to “” after this line near the end of getAnnouncements():

copy returnValue & theCode to returnValue -- Now put the announcements at the end of any previous announcements.

The next time round the repeat, theCode starts off with the value it picked up the first time, and that value is reappended to returnValue along with the next lot of code.
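
To see the effect in isolation, here is a stripped-down sketch. The handler accumulate() and the list of letters are stand-ins made up for illustration; each letter plays the part of one paper’s announcements.

```applescript
-- Hypothetical stand-in for getAnnouncements(): each "paper" contributes one letter
on accumulate(paperNames, resetEachTime)
	set returnValue to ""
	set theCode to ""
	repeat with i from 1 to (count paperNames)
		-- The fix: start each paper with an empty theCode
		if resetEachTime then set theCode to ""
		set theCode to theCode & (item i of paperNames)
		set returnValue to returnValue & theCode
	end repeat
	return returnValue
end accumulate

accumulate({"A", "B", "C"}, false) --> "AABABC" (earlier papers are repeated)
accumulate({"A", "B", "C"}, true) --> "ABC"
```

Without the reset, the first paper’s text is appended again on every later pass, which matches earlier obits showing up more often than later ones.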

You can either add a line ‘set theCode to “”’ immediately after the line above, or simply move the initial setting to just inside the top of the repeat:

on getAnnouncements(theFolders, announceKey, folderPath)
	
	-- This function extracts the text from Quark and text files and assembles it into one big text chunk.
	-- Parameters:
	-- theFolders     - A list of the folders containing the announcements we want to run
	-- announceKey  - The keyword in the filename for the announcement type
	-- folderPath     - The path to the files
	
	-- Initialize the return value
	set returnValue to "" -- the return variable
	
	repeat with thePaper from 1 to (the number of words in theFolders) -- step through the paper dates
		
		-- Initialize these variables each time round the repeat
		set theText to "" -- holds the announcements
		set theCode to "" -- holds the announcements after they've been formatted in HTML
		
		-- Drill down to the " Gone Copy" folder. This expects a specific file structure and will fail if the structure is changed.
		copy folderPath & ":" & PaperYear(word thePaper of theFolders) & ":" & (word thePaper of theFolders) & ": gone copy" to announceFolderPath
		
		-- etc.

Success! The only repeat this time is the result of our having run the obit twice, so it’s in the files twice. Thanks!

Now I just need to add in the weddings and engagements, add the capability to read the RTF files that have suddenly started turning up occasionally (that’s where that double rule on the page came from – the script skipped the RTF file but drew the divider line), and then figure out what I need to change for the classified line ads, which have a slightly different file structure. By next week I should either have the site working correctly, or be back here with another bug. :)

OK, I said I’d either have the site working or be back with another bug. SO… I’m back with another bug. :-\

I managed to get all three announcements working. The classified ads, however, use a different file structure in a different folder on the server. Instead of new files being generated for each paper, the same Quark files are simply modified each week. I adapted the getAnnouncements() function into getClassifieds(). I must have the same repeat bug as before. Maybe I’m blind, but I’m darned if I can find it.

Also, I’d like the classifieds to get put on the page in order, i.e. 100-Lots & Land, 105-Homes for sale, etc. The script keeps putting them in its own seemingly random order (by creation or mod date maybe?). How can I sort the files by filename so they’re read in the proper order?

Is there a better, more efficient way to integrate the classifieds with the announcements? If so, what is it?

The output of this new part of the revamped script is here: http://www.heraldindependent.com/html/classified.html

And here’s the getClassifieds() function. I basically just added a call to this function in the main script I posted before, and added some extra lines for the classifieds as appropriate. The announcements are all working correctly, BTW – thanks for the help there!


on getClassifieds()
	-- This handler creates the classified file.
	-- There's probably a more efficient way to do this since there's a lot of the same code here as in getAnnouncements(), but...
	-- initialize some variables
	set theText to "" -- this will hold the text of the classified
	set theclassified to "" -- this will hold all of the classified that will appear on the page
	set theclassifiedCode to "" -- this will hold the classified in HTML
	set theHeading to "" -- this will hold the name of each classified section
	
	-- Set up the work files
	set folderPath to "WinVault:News Copy:CLASSIFIEDS" -- hard coded
	
	set fileList to (list folder folderPath) -- put the names of all the files in the class folder into a variable
	
	-- Use Quark to extract the info into a variable
	repeat with theFile from 1 to (the number of items in fileList)
		if (item theFile of fileList) contains "-" then -- checks to make sure it's a valid classified file
			tell application "QuarkXPress™"
				activate
				open file (folderPath & ":" & (item theFile of fileList))
				try -- it probably doesn't need to be this complicated but I copied this block from someone else's script
					repeat with theStory from 1 to count of stories in document 1
						if exists word 1 of story theStory of document 1 then
							copy "<h3>" & (item theFile of fileList) & "</h3>" & return & "<P>" & return to theHeading -- add the name of this section to the top of this section
							
							copy (theHeading & the text of story theStory of document 1) to theText -- dump the text into a variable
						end if
					end repeat
				on error errMsg number errnum
					-- need to beware of problems here (disk full, ???)
					display dialog (errMsg & return & return & errnum) buttons {"Oops!"} default button 1 with icon 0
					return
				end try
				
				close document 1 -- close the file
				
			end tell
		end if
		
		set theclassified to (theclassified & return & "<HR>" & return & theText) -- get all the ads together with a standard separator between them
		
	end repeat
	
	-- Now format the text in HTML in a new variable theclassifiedCode, which only means adding "<P>" tags at the graph breaks
	repeat with theGraph from 1 to (the number of paragraphs in theclassified)
		copy theclassifiedCode & (paragraph theGraph of theclassified) & "<P>" & return to theclassifiedCode
	end repeat
	
	
	return theclassifiedCode -- sends back the whole shebang, hopefully nicely formatted in HTML
	
end getClassifieds

Model: Power Mac G4 500MHz
AppleScript: 1.9.3
Browser: Firefox 1.5.0.3
Operating System: Mac OS X (10.3.9)

That’s very perplexing, Matt. :confused:

I can’t see anything there that would cause the text of individual files to be included several times in the output. Even in the repeat through the ‘stories’ (whatever they are) in each QuarkXPress document, the variables are set afresh each time.

Also, the names in a list returned by ‘list folder’ are almost invariably in lexical order. (They’re not documented to be so, it just happens.) If your file names all begin with three-digit numbers, they should be listed, and thus processed, in the order you want.

I suppose the next thing to do is to check the files themselves. Is there more than one copy of each in the folder? Open them manually in QuarkXPress and see if the text is repeated in them.

Then run bits of the handler at a time. Paste it into a separate Script Editor document and add a ‘getClassifieds()’ call after it. Insert a line ‘return result’ after the ‘list folder’ line, run the script, and see what appears in the ‘Result’ pane at the bottom of the Script Editor window. If it’s what you’re expecting, delete the ‘return result’ line and insert it after the next place in the script where a variable is set. And so on.

The ‘list folder’ function returns the names of invisible files too by default. You could change the line where you call it to make sure that only visible files are included:

set fileList to (list folder folderPath without invisibles)

Oh, great. I’ve stumped an expert. :frowning:

That was one of the first things I checked. The text is in each file only once. If it was repeated in the files, it’d be repeated in the physical paper, too, and I’d get yelled at. :wink:

FYI, here’s the part of the Event Log that returns the folder list (I made the change you suggested). Notice that the files are in some screwy order. In a much earlier attempt at this script I tried using the sort command; IIRC it had no effect. In this list you can also see why I test for “-” in the filename, as I have a couple template files in the folder. As far as I can tell from the event log, these template files are not being read so that if-then test is working as expected.

There’s one Classified file that has two stories – there’s a template for the Creditors on the pasteboard which is being included. Here’s the typical output from one of the smaller Classified files, “105-Homes for sale”:

I figured the Event Log gives pretty much the same info as logging in the Result, I just have to plow through everything.

As you can see from the above examples, there aren’t any duplicate files and the text being returned from each file is what I’m expecting with each file being listed and read only once. The HTML being written to the file has duplicate entries, though. The problem has to be somewhere in the creation of the HTML in the getClassifieds() handler. One thought that’s occurred to me is that getAnnouncements() adds <P> tags and then <HR> tags while getClassifieds() does things the other way around – but I can’t see how that could matter.

Could my whole algorithm be at fault? Is there a better way to accomplish the same result?

One other unrelated question – can the text of an RTF file be retrieved and manipulated directly the way a plain text file can? What file type would I use to get the script to recognize RTF files? An obit was emailed to me and I had to edit it in TextEdit, which of course won’t save plain text files, so now I have one RTF obit file I have to deal with.

But not for long! :smiley: It’s so obvious when you see it. The line that sets theclassified (between the ‘end if’ and the ‘end repeat’) should be up there in the inner ‘if’ block, so that it only gets set when there’s a hit on a story. Otherwise, when there’s no hit, it has the previous story tagged onto it again. Duh!

on getClassifieds()
	-- This handler creates the classified file.
	-- There's probably a more efficient way to do this since there's a lot of the same code here as in getAnnouncements(), but...
	-- initialize some variables
	set theText to "" -- this will hold the text of the classified
	set theclassified to "" -- this will hold all of the classified that will appear on the page
	set theclassifiedCode to "" -- this will hold the classified in HTML
	set theHeading to "" -- this will hold the name of each classified section
	
	-- Set up the work files
	set folderPath to "WinVault:News Copy:CLASSIFIEDS" -- hard coded
	
	set fileList to (list folder folderPath) -- put the names of all the files in the class folder into a variable
	
	-- Use Quark to extract the info into a variable
	repeat with theFile from 1 to (the number of items in fileList)
		if (item theFile of fileList) contains "-" then -- checks to make sure it's a valid classified file
			tell application "QuarkXPress™"
				activate
				open file (folderPath & ":" & (item theFile of fileList))
				try -- it probably doesn't need to be this complicated but I copied this block from someone else's script
					repeat with theStory from 1 to count of stories in document 1
						if exists word 1 of story theStory of document 1 then
							copy "<h3>" & (item theFile of fileList) & "</h3>" & return & "<P>" & return to theHeading -- add the name of this section to the top of this section
							
							copy (theHeading & the text of story theStory of document 1) to theText -- dump the text into a variable
							-- DO THIS HERE!
							set theclassified to (theclassified & return & "<HR>" & return & theText) -- get all the ads together with a standard separator between them
						end if
					end repeat
				on error errMsg number errnum
					-- need to beware of problems here (disk full, ???)
					display dialog (errMsg & return & return & errnum) buttons {"Oops!"} default button 1 with icon 0
					return
				end try
				
				close document 1 -- close the file
				
			end tell
		end if
		
	end repeat
	
	-- Now format the text in HTML in a new variable theclassifiedCode, which only means adding "<P>" tags at the graph breaks
	repeat with theGraph from 1 to (the number of paragraphs in theclassified)
		copy theclassifiedCode & (paragraph theGraph of theclassified) & "<P>" & return to theclassifiedCode
	end repeat
	
	
	return theclassifiedCode -- sends back the whole shebang, hopefully nicely formatted in HTML
	
end getClassifieds

Hopefully that’ll cure the repeated text problem.
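
The effect of where the append sits can be shown without Quark at all. The following is only a sketch with made-up file names and made-up "text": it collects results in two lists, one appended to on every pass (as in the original handler) and one appended to only inside the ‘if’ block (as in the fix).

```applescript
-- Hypothetical file names; only those containing "-" count as classified files.
set fileList to {"100-Lots", "Template", "105-Homes"}

set buggyResult to {}
set fixedResult to {}
set theText to ""

repeat with i from 1 to (count fileList)
	set thisName to item i of fileList
	if thisName contains "-" then
		set theText to "text of " & thisName -- stands in for reading the Quark story
		set end of fixedResult to theText -- appended only when something was read
	end if
	set end of buggyResult to theText -- appended on every pass, stale or not
end repeat

return {buggyResult, fixedResult}
-- {{"text of 100-Lots", "text of 100-Lots", "text of 105-Homes"}, {"text of 100-Lots", "text of 105-Homes"}}
```

The skipped "Template" entry is what triggers the duplicate: theText still holds the previous file's text when the append outside the ‘if’ runs.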

Your basic algorithm seems to be OK. It could be made more efficient by reducing the amount of concatenation and by using ‘set’ instead of ‘copy’. Here’s how I’d probably code what you’ve written, but since I don’t have either QuarkXPress or your files, I haven’t been able to test it to make sure it’s completely bug-free. Since ‘list folder’ obviously isn’t returning the file names in lexical order, I’ve included a sort handler to look after that. If the files are still processed in the wrong order after that, there may be funny characters in the names themselves.


on getClassifieds()
	-- This handler creates the classified file.
	
	set theclassified to "" -- this will hold all of the classified that will appear on the page. Initialise it to "".
	
	-- Set up the work files
	set folderPath to "WinVault:News Copy:CLASSIFIEDS:" -- hard coded
	
	set fileList to (list folder folderPath without invisibles) -- put the names of all the files in the class folder into a variable
	qsort(fileList, 1, -1) -- make sure the list is in lexical order
	
	-- Create a template list, to be used for each section. (Reduces concatenation waste.)
	-- Item 2 will take the name of each section. Item 4 will take the text.
	set textTemplate to {return & "<HR>" & return & "<h3>", missing value, "</h3>" & return & "<P>" & return, missing value}
	
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ""
	
	-- Use Quark to extract the info into a variable
	repeat with theFile from 1 to (count fileList)
		set thisFileName to item theFile of fileList
		if (thisFileName contains "-") then -- checks to make sure it's a valid classified file
			tell application "QuarkXPress™"
				activate
				open file (folderPath & thisFileName)
				try -- it probably doesn't need to be this complicated but I copied this block from someone else's script
					repeat with theStory from 1 to count of stories in document 1
						if exists word 1 of story theStory of document 1 then
							set item 2 of textTemplate to thisfilename -- insert the name of this section into the template
							set item 4 of textTemplate to the text of story theStory of document 1 -- insert the text
							set theclassified to theclassified & textTemplate -- concatenate the template to the total text so far
						end if
					end repeat
				on error errMsg number errnum
					-- need to beware of problems here (disk full, ???)
					display dialog (errMsg & return & return & errnum) buttons {"Oops!"} default button 1 with icon 0
					return
				end try
				
				close document 1 -- close the file
				
			end tell
		end if
	end repeat
	
	-- Now format the text in HTML in a new variable theclassifiedCode, which only means adding "<P>" tags at the graph breaks
	set theclassifiedCode to theclassified's paragraphs
	set AppleScript's text item delimiters to "<P>" & return
	set theclassifiedCode to theclassifiedCode as text
	
	set AppleScript's text item delimiters to astid
	
	return theclassifiedCode -- sends back the whole shebang, hopefully nicely formatted in HTML
	
end getClassifieds

on qsort(theList, l, r)
	script o
		property cutoff : 10
		property p : theList
		
		on qsrt(l, r)
			set i to l
			set j to r
			set v to my p's item ((l + r) div 2)
			
			repeat while (j > i)
				set u to my p's item i
				repeat while (u < v)
					set i to i + 1
					set u to my p's item i
				end repeat
				
				set w to my p's item j
				repeat while (w > v)
					set j to j - 1
					set w to my p's item j
				end repeat
				
				if (i > j) then
				else
					set my p's item i to w
					set my p's item j to u
					set i to i + 1
					set j to j - 1
				end if
			end repeat
			
			if (j - l < cutoff) then
			else
				qsrt(l, j)
			end if
			
			if (r - i < cutoff) then
			else
				qsrt(i, r)
			end if
		end qsrt
		
		on isrt(l, r)
			set x to l
			set z to l + cutoff - 1
			if (z > r) then set z to r
			
			set v to my p's item x
			repeat with y from (x + 1) to z
				if (my p's item y < v) then
					set x to y
					set v to my p's item y
				end if
			end repeat
			
			tell my p's item l
				set my p's item l to v
				set my p's item x to it
			end tell
			
			set u to my p's item (l + 1)
			repeat with i from (l + 2) to r
				set v to my p's item i
				if (v < u) then
					set my p's item i to u
					repeat with j from (i - 2) to l by -1
						if (v < my p's item j) then
							set my p's item (j + 1) to my p's item j
						else
							set my p's item (j + 1) to v
							exit repeat
						end if
					end repeat
				else
					set u to v
				end if
			end repeat
		end isrt
	end script
	
	set listLen to (count theList)
	if (listLen > 1) then -- otherwise the handler will error
		-- Translate negative indices
		if (l < 0) then set l to listLen + l + 1
		if (r < 0) then set r to listLen + r + 1
		
		if (r = l) then
			-- No point in sorting just one item
		else
			-- Transpose transposed indices
			if (l > r) then
				set temp to l
				set l to r
				set r to temp
			end if
			
			if (r - l < o's cutoff) then
				-- Skip the Quicksort if cutoff or less items
			else
				o's qsrt(l, r)
			end if
			o's isrt(l, r)
		end if
	end if
	
	return -- nothing
end qsort
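
The paragraph-joining step near the end of getClassifieds() relies on AppleScript's text item delimiters. Here it is in isolation, as a small sketch with made-up ad text (the variable names theParagraphs and joinedText are just for this example):

```applescript
-- Two made-up paragraphs standing in for the collected classified text.
set theParagraphs to paragraphs of ("First ad" & return & "Second ad")

set astid to AppleScript's text item delimiters -- remember the current setting
set AppleScript's text item delimiters to "<P>" & return
set joinedText to theParagraphs as text -- the delimiter goes between the items
set AppleScript's text item delimiters to astid -- always restore it afterwards

return joinedText -- "First ad<P>" & return & "Second ad"
```

One difference from the original repeat loop is worth knowing: the delimiter is only inserted between paragraphs, so there is no trailing "<P>" after the last one.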

Sorry. I forgot to reply to this bit:

You could use TextEdit to get the text and use the file write command from the StandardAdditions to save it.

tell application "TextEdit"
	open file pathToMyRTFFile
	-- A trick to convert (convertible) Unicode to plain text.
	set {text:plainText} to (text of front document) as text
end tell

-- Either use plainText directly, or save it to a file as follows:

set fRef to (open for access file pathToMyTextFile with write permission)
try
	set eof fRef to 0
	write plainText to fRef
end try
close access fRef

I took your code and replaced mine. You had one misnamed variable, which was easy to find. Once that was fixed it worked perfectly. I now have all three announcement types and the classified ads working properly, more or less. Look at www.heraldindependent.com to see the results of all this work.

That’s one heckuva sort handler!!! I guess that’s part of why I’m such a poor programmer, I tend to naively think I can do something with one or two lines when it’s best done with a huge subroutine like that. Anyway, thanks for the help.

I’ve also added a blog entry on our site thanking you for all your help. I could never have gotten the site working without the generous help you’ve given! Many, many thanks! I think I can handle the few little odds and ends I still need to add now.

Hi, Matt. I’m glad it’s all working now. Sorry about the variable.

There are shorter sort handlers. :slight_smile: That one’s fast and works on any version of Mac OS.
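
For comparison, a shorter handler might look something like the sketch below. It is a plain insertion sort, not the handler posted above: simpleSort is a made-up name, the file names in the call are only examples, and it will be much slower than the quicksort on long lists, though for a folder's worth of file names that should hardly matter.

```applescript
on simpleSort(theList)
	-- Returns a new sorted list; the original list is left untouched.
	set sortedList to {}
	repeat with i from 1 to (count theList)
		set thisItem to item i of theList
		-- Find the first position whose item is greater than thisItem.
		set insertAt to (count sortedList) + 1
		repeat with j from 1 to (count sortedList)
			if thisItem < item j of sortedList then
				set insertAt to j
				exit repeat
			end if
		end repeat
		-- Rebuild the list with thisItem in that position.
		if insertAt = 1 then
			set sortedList to {thisItem} & sortedList
		else if insertAt > (count sortedList) then
			set end of sortedList to thisItem
		else
			set sortedList to (items 1 thru (insertAt - 1) of sortedList) & {thisItem} & (items insertAt thru -1 of sortedList)
		end if
	end repeat
	return sortedList
end simpleSort

simpleSort({"105-Homes for sale", "200-Autos", "100-Lots & Land"})
-- {"100-Lots & Land", "105-Homes for sale", "200-Autos"}
```

Unlike qsort(), which sorts the list in place and returns nothing, this version returns a new list, so the call would be ‘set fileList to simpleSort(fileList)’.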

Thanks for the mention in your blog. Good luck with the remaining odds and ends.