Quark 5.0 extract text

Hello. I am new to AppleScript, but I wondered if someone could help me with this problem:

I have a large number of documents, each several pages long, from which I would like to extract the text that is styled with a specific “title” style from the style sheet.

Any help on how I would go about doing this would be appreciated.

Thank you in advance.

Rita

A little more info on the format you need the text saved in would help. One way to do this would be to step through each page and text box and test the style of each paragraph; if a paragraph meets the test, save it to a string, then write that string to a text file.
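
As a rough sketch of the last two steps only (collecting the matches into one string and writing it out), something along these lines might do. It uses just the Standard Additions file commands, so it runs without QuarkXPress. The handler name writeTextToFile, the sample titleList, and the output path are placeholders I made up for illustration, not anything from your documents.

```applescript
-- Hypothetical handler: writes theText to the file at hfsPath, replacing any old contents.
on writeTextToFile(theText, hfsPath)
	-- open for access creates the file if it does not exist yet
	set fref to open for access file hfsPath with write permission
	try
		set eof fref to 0 -- empty the file if it already had contents
		write theText to fref
		close access fref
	on error errMsg number errNum
		close access fref -- don't leave the file open after a failure
		error errMsg number errNum
	end try
end writeTextToFile

-- Stand-in for the paragraphs that passed the style test (assumed sample data).
set titleList to {"First title", "Second title"}

-- Join the matches with hard returns, then restore the delimiters.
set saveDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {return}
set outText to (titleList as text) & return
set AppleScript's text item delimiters to saveDelims

-- Assumed example path; substitute a location on your own disk.
writeTextToFile(outText, "Macintosh HD:titles.txt")
```

Building the whole string first means the file is opened and closed once per document, and the try block makes sure it gets closed even if the write fails.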

I would just need the text in “plain text” format, with each instance separated by a hard return – it is to go into a database eventually.

Hope that helps,

Rita

Something like this should work (of course, substitute your own text file path). I’m making some guesses about exactly what you want. I actually ran this in 4.x under OS 9.1, because my Classic 9.2 setup, which is where I have 5.x, is on the blink, but it should work anyhow. It could be more efficient…


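-- Folder that receives the extracted text files; it must already exist, and the path must end with a colon.
property txtFilePath : "Disk o' Dan:TEST:Extracted Text Files:"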

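on open dl
	-- Save the current delimiters so they can be restored at the end.
	set saveDelims to AppleScript's text item delimiters
	-- Split paths on ":" so the last text item is the document's file name.
	set AppleScript's text item delimiters to {":"}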
	repeat with td in dl
		-- Change the application name below to match your installed copy of QuarkXPress.
		tell application "QuarkXPress™ 4.11"
			open td use doc prefs yes remap fonts no do auto picture import no without reflow
			activate
			tell document 1
				
				set titleIndex to 0
				repeat with ss from 1 to count of style spec
					--display dialog "" & name of style spec ss
					if name of style spec ss is "title" then
						set titleIndex to ss
						exit repeat
					end if
				end repeat
				
				if titleIndex is 0 then
					display dialog "Can't find title style sheet."
				else
					
					
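					-- Name the text file after the document, create it in the Finder, then open it for writing.
					-- The Finder reports an error if a file with that name already exists.
					set txtFile to text item -1 of (td as text) & ".txt"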
					tell application "Finder" to make file at alias txtFilePath with properties {name:txtFile}
					set fref to (open for access alias (txtFilePath & txtFile) with write permission)
					repeat with sprd from 1 to count of spreads
						tell spread sprd
							repeat with tbox from 1 to count of text boxes
								tell text box tbox
									repeat with para from 1 to count of paragraphs
										if index of style sheet of paragraph para is titleIndex then
											set s to paragraph para as text
											write s & return to fref
										end if
									end repeat
								end tell
							end repeat
						end tell
					end repeat
					close access fref
				end if
			end tell
			close document 1 saving no
		end tell
	end repeat
	set AppleScript's text item delimiters to saveDelims
end open

HTH,

Dan