I have a working AppleScript that finds a paragraph of copy based on the first word of that paragraph.
This works fine.
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get every paragraph whose first word is "something")
end tell
end tell
But I have one line of text I need to get that always begins with PQ- (as in PQ-FLW or PQ-MINT) and its position changes from document to document, as does its number of characters, so I can’t get it by its location.
This script seems logical and compiles, but it doesn’t return anything.
tell application "TextEdit"
activate
tell document 1
set textToCopy to get every word whose characters 1 thru 3 is "PQ-"
end tell
end tell
I’ve tried it using “every word” and “every paragraph”. Can someone shed some light on this for me?
tell application "TextEdit"
activate
tell document 1
set textToCopy to paragraphs whose (first word is "PQ") and (third character is "-")
end tell
end tell
How is a line distinguished from a paragraph - do only paragraphs have returns as their last character and the lines are just wrapped by TextEdit? You use the word “line”, but I don’t know what a “line” is in your text.
If you just want to isolate any paragraph containing “PQ-” anywhere and there will be only one, then offset works:
set Text1 to "This is several lines of text
with a PQ-Something embedded in it.
PQ-other is another example
The text goes on."
set Text2 to "This text doesn't contain the
magic characters."
set char to "PQ-" as string
offset of char in Text1 --> 38
offset of char in Text2 --> 0
The challenge, though, is this: if a paragraph (a group of sentences wrapped on the screen but containing only one return character - i.e. a big string) is part of a group of paragraphs and contains the characters PQ-ArbitraryText, what’s the easiest way to identify that paragraph? As I interpreted his quandary, the paragraph doesn’t necessarily start with those letters. The first part of his post is a red herring - it works. It seemed to me that the second part is the question I’ve posed. TIDs could find out, but there must be a simpler way. If it truly is just a string, then contains works. I’ll await clarification.
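If the text is pulled into AppleScript as one string, the "contains" idea could be sketched roughly as below. The handler name firstParagraphContaining and the sample text are made up for illustration, and this assumes a simple paragraph-by-paragraph scan is acceptable:

```applescript
-- Hypothetical helper (not from the thread): return the first paragraph of
-- sourceText that contains needle, or "" if no paragraph does.
on firstParagraphContaining(needle, sourceText)
	repeat with paraRef in paragraphs of sourceText
		set thisPara to contents of paraRef
		if thisPara contains needle then return thisPara
	end repeat
	return ""
end firstParagraphContaining

set sampleText to "This is several lines of text" & return & ¬
	"with a PQ-Something embedded in it." & return & ¬
	"The text goes on."
firstParagraphContaining("PQ-", sampleText)
--> "with a PQ-Something embedded in it."
```

Because "contains" ignores case by default, wrapping the test in a considering case block would be needed if "pq-" must not match.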
Ok this should do it if the word may not be the first word.
tell application "TextEdit"
tell front document
set the_word to first word where it begins with "PQ-"
end tell
set t to text of front document
end tell
set o to offset of the_word in t
set sub_t to text 1 thru o of t
set c to (count paragraphs of sub_t)
set the_paragraph to paragraph c of t
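The offset-then-count idea can also be tried on a plain AppleScript string, without asking TextEdit for words at all. This is only a sketch: paragraphIndexOf and the sample text are hypothetical names, and it assumes the first occurrence of the marker is the one wanted:

```applescript
-- Hypothetical helper: return the index of the paragraph in which needle
-- first occurs, or 0 if needle is not found.
on paragraphIndexOf(needle, sourceText)
	set o to offset of needle in sourceText
	if o is 0 then return 0
	-- Everything up to the match ends inside the paragraph we are after,
	-- so counting the paragraphs of that chunk gives its index.
	return count paragraphs of (text 1 thru o of sourceText)
end paragraphIndexOf

set sampleText to "Dear customer," & return & "Pay by the due date." & return & ¬
	"PQ-MINT" & return & "Sincerely"
set idx to paragraphIndexOf("PQ-", sampleText) --> 3
paragraph idx of sampleText --> "PQ-MINT"
```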
The inference that I (and apparently others) drew differs slightly. But I agree with your final point - we can speculate about different interpretations until the cows come home (or the OP clarifies the position). I was merely offering a suggestion that followed an existing line of reasoning - but that had not yet been specifically covered.
My last script still wasn’t right. Now I’ve got it. This gets the paragraph that contains some word beginning with “PQ-”
tell application "TextEdit"
set the_word to first word of front document where it begins with "PQ-"
set p to first paragraph of front document where some word of it is the_word
end tell
Edited: now to crunch it up:
tell application "TextEdit"
set p to first paragraph of front document where some word of it begins with "PQ-"
end tell
So Kai solves the problem neatly if the paragraph starts with “PQ-”, and Kel does if the problem is as I saw it. Neat both ways, but I think Kel solves it no matter what. Nice.
Not here, strangely enough. OMM, AppleScript appears to regard hyphens as word separators in Unicode text, and connectors in plain or international text. Since my text comes from TextEdit as Unicode text, the script sees no words starting with “PQ-”, but will readily identify any that are “PQ”.
Nevertheless, Kel’s script evidently works for both Kel and Adam, so I need to do some more homework to identify why my setup (Mac OS X 10.4.5, AS 1.10.3, TextEdit 1.4) should behave differently. In this situation, though, a reasonably effective method is to pull the text into AppleScript and perform any extractions there, instead:
on words_started by s from t
set d to text item delimiters
set text item delimiters to s
set t to rest of t's text items
repeat with i in t
tell i's word 1 to if it is i's text 1 thru word 1 then
set i's contents to {""} & it as Unicode text
else
set i's contents to false
end if
end repeat
set text item delimiters to d
t's every Unicode text
end words_started
words_started by "PQ-" from text of document 1 of application "TextEdit"
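For anyone who finds the handler above a bit dense, here is a more long-winded sketch of the same general idea: split on the prefix with text item delimiters, then keep each piece up to the next whitespace. It is not equivalent to the handler above: it does no word-boundary check, so it would also pick up a "PQ-" in the middle of a longer string. The name codesStartingWith and the sample text are hypothetical:

```applescript
-- Hypothetical helper: collect every run of non-whitespace characters
-- that starts with thePrefix anywhere in sourceText.
on codesStartingWith(thePrefix, sourceText)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to thePrefix
	-- The first text item is whatever precedes the first prefix; drop it.
	set pieces to rest of (text items of sourceText)
	set AppleScript's text item delimiters to oldDelims
	
	set foundCodes to {}
	repeat with pieceRef in pieces
		set pieceText to contents of pieceRef
		set tailText to ""
		-- Keep characters until the first space, tab, return or linefeed.
		repeat with chRef in characters of pieceText
			set ch to contents of chRef
			if ch is in {space, tab, return, (ASCII character 10)} then exit repeat
			set tailText to tailText & ch
		end repeat
		set end of foundCodes to thePrefix & tailText
	end repeat
	return foundCodes
end codesStartingWith

set sampleText to "Code PQ-FLW here" & return & "PQ-MINT"
codesStartingWith("PQ-", sampleText) --> {"PQ-FLW", "PQ-MINT"}
```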
I misled you, Kai. I didn’t try it with text from TextEdit - I just used an AppleScript variable with a bunch of dummy stuff and PQ-stuff stuck in the middle. Sorry. Didn’t test it with Unicode text.
Thanks for all the help; it’s a little overwhelming. Allow me to clarify my quest.
My text document contains info for each customer’s letter and two-sided insert. So when I pull the copy I need into the insert I don’t need all the copy, just a couple of paragraphs and a code that is PQ- (for all customers) followed by the unique client code. So, the code is PQ-FLW or PQ-MINT, etc. The code is on different lines due to variances in the amount of text per client.
The way I am locating the text is by having the copywriter put a full return before every paragraph as well as the code. The code is on a line by itself, but not the same line from document to document.
My solution is to have the copywriter put the code as the very last line of text; I can get that.
I am still interested, however, in being able to locate that darned code by the PQ- which is common to all codes, and then select the entire text on that line. Then it won’t matter where the code falls.
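Since the code sits on a line by itself, one possible approach is to pull the whole text into AppleScript and test each paragraph with "begins with". A minimal sketch, where paragraphsStartingWith and the sample text are hypothetical; in practice the text would presumably come from something like text of document 1 of TextEdit:

```applescript
-- Hypothetical helper: return every paragraph of sourceText that
-- begins with thePrefix, wherever it falls in the text.
on paragraphsStartingWith(thePrefix, sourceText)
	set matches to {}
	repeat with paraRef in paragraphs of sourceText
		set thisPara to contents of paraRef
		if thisPara begins with thePrefix then set end of matches to thisPara
	end repeat
	return matches
end paragraphsStartingWith

set sampleText to "Pay to the order of" & return & "" & return & ¬
	"PQ-FLW" & return & "" & return & "Whatever else"
paragraphsStartingWith("PQ-", sampleText) --> {"PQ-FLW"}
```

The result is a list, so item 1 of it would be the code line when there is exactly one. This works on the whole paragraph rather than on words, so it should not matter whether hyphens are treated as word separators.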
That’s strange. I can’t get TextEdit to return text that treats hyphens as word separators. It must be the version 1.2 unless it’s some setting on my computer for default text encoding.
I never trust “words” to return the same thing, or even what I expect. I view the words command as something you use when you don’t care about precise work, just returning a simple bit of text whose precision is unimportant. In other words, I almost never use it.
What I use when I want to extract a substring is a custom handler I wrote called getTextBetween. It takes some source text, looks for the first occurrence of some starting text, and returns everything from there up to some ending text. One important note: it does NOT return the “starting text” as part of the result - you’d have to add that back on yourself. In the example I included, if you wanted “Contact:Jeff Robertson” you would need to put the “Contact:” back on to the beginning of what this returns. Also note that my example has multiple places where there is text between “:” and a return, and the handler lets you ask for the other occurrences. Since what the handler does is split the text up into pieces, the first “between” text is the 2nd text item. So, if you wanted to get “856-321-4567” in my example, you would add textItemNum:3 to the record. The basic call, which returns the first occurrence, looks like this:
set someText to "Contact:Jeff Robertson" & return & "Phone:856-321-4567" & return & "Fax:856-321-9999"
getTextBetween({sourceText:someText, beforeText:":", afterText:return})
--> "Jeff Robertson"
on getTextBetween(prefs)
-- version 1.4, Daniel A. Shockley <http://www.danshockley.com>
-- gets the text between specified occurrence of beforeText and afterText in sourceText
-- the default textItemNum should be 2
set defaultPrefs to {textItemNum:2}
if (class of prefs is not list) and (class of prefs is not record) then
error "getTextBetween FAILED: parameter should be a record or list. If it is multiple items, just make it into a list to upgrade to this handler." number 1024
end if
if class of prefs is list then
if (count of prefs) is 4 then
set textItemNum of defaultPrefs to item 4 of prefs
end if
set prefs to {sourceText:item 1 of prefs, beforeText:item 2 of prefs, afterText:item 3 of prefs}
end if
set prefs to prefs & defaultPrefs -- add on default preferences, if needed
set sourceText to sourceText of prefs
set beforeText to beforeText of prefs
set afterText to afterText of prefs
set textItemNum to textItemNum of prefs
try
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to the beforeText
set the prefixRemoved to text item textItemNum of sourceText
set AppleScript's text item delimiters to afterText
set the finalResult to text item 1 of prefixRemoved
set AppleScript's text item delimiters to oldDelims
on error errMsg number errNum
set AppleScript's text item delimiters to oldDelims
-- tell me to log "Error in getTextBetween() : " & errMsg
set the finalResult to "" -- return nothing if the surrounding text is not found
end try
return finalResult
end getTextBetween
You can also call it with a simple list of three items as the parameter:
getTextBetween({someText, ":", return})
That becomes a little too troublesome for what I need, so I chose to have my copywriter place the elusive text at the end of the Word document and I can get it as “last paragraph”, clean and sweet. Here’s my script.
--this gets the word file and flows the text into the text box--
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get every paragraph whose first word is "Pay")
end tell
end tell
--this opens the Quark file and flows the text into the text box--
tell application "Finder"
activate
open document file "test_file.qxd" of folder "Desktop" of folder "me" of folder "Users" of startup disk
end tell
tell application "QuarkXPress"
tell document 1
set story 1 of text box 1 to textToCopy
end tell
end tell
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get every paragraph whose first word is "Whatever")
end tell
end tell
tell application "QuarkXPress"
tell document 1
set story 1 of text box 2 to textToCopy
end tell
end tell
--This particular text field doesn't get printed on press, but it identifies this piece by the customer code on the print-outs on page 1 (there are over 100 customers). The customer code doesn't appear by itself anywhere in the Word document but it does appear in the email address. So I flow the email address into the Quark text box and have Quark delete the email info that precedes the customer code number --
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get every paragraph whose first word is "clientname")
end tell
end tell
tell application "QuarkXPress"
tell document 1
set story 1 of text box 3 to textToCopy
delete (characters 1 thru 11) of story 1 of text box 3
end tell
end tell
--here is where I get the "PQ-blahblah" code that I had the copywriter move to the last line so I could find it.
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get last paragraph)
end tell
end tell
tell application "QuarkXPress"
tell document 1
tell page 2
set story 1 of text box 1 to textToCopy
end tell
end tell
end tell
--this text is where I insert the actual email address. I use it below again just to separate out the customer code--
tell application "TextEdit"
activate
tell document 1
set textToCopy to (get every paragraph whose first word is "clientname")
end tell
end tell
--This particular text field doesn't get printed on press, but it identifies this piece by the customer code on the print-outs on page 2. The customer code doesn't appear by itself anywhere in the Word document but it does appear in the email address. So I flow the email address into the Quark text box and have Quark delete the email info that precedes the customer code number --
--this opens the Quark file and flows the text into the text box
tell application "QuarkXPress"
tell document 1
tell page 2
set story 1 of text box 2 to textToCopy
end tell
end tell
end tell
tell application "QuarkXPress"
tell document 1
tell page 2
set story 1 of text box 3 to textToCopy
delete (characters 1 thru 11) of story 1 of text box 3
end tell
end tell
end tell
tell application "TextEdit"
close document 1
end tell
--this action locates the Quark file opened above for Quark to access
tell application "Finder"
activate
open document file "Test_file.qxd" of folder "Desktop" of folder "dskipworth" of folder "Users" of startup disk
end tell
--this action opens a dialog box for the user to name the file. Then it saves the file in the appropriate folder and closes it
tell application "QuarkXPress"
activate
get document 1
set doc to document 1
display dialog ("save doc as") default answer "8483_" with icon note
set nom to text returned of result
save doc in ("Macintosh HD:Users:me:Desktop:FileTarget:" & nom)
close document 1
end tell
Now, here’s my next dilemma. I am using a drag & drop to drag each Microsoft Word document onto the script.
In order for TextEdit to get the text and flow it into Quark I need to get it into plain text. I can either save the text to a new TextEdit document, or put it on the clipboard, pull it off in a script and put it into the Quark doc without creating an intermediary TextEdit document.
Here’s what I have so far. I can’t seem to get the Word doc text and do anything with it once I select it and get it highlighted. I think I need to send it to the clipboard, but this script returns nothing.
(*on open some_items
tell application "TextEdit"
open every item of some_items
end tell
end open*)
--I need a script here to tell Microsoft Word to get this file which was drag & dropped thru the script above--
tell application "Microsoft Word"
launch
activate
do Visual Basic "Selection.WholeStory"
--this script to set the text to the clipboard returns nothing--
set the clipboard to selection as string
end tell
tell application "TextEdit"
set the clipboard to «class ktxt» of ((the clipboard as text) as record)
return (the clipboard)
set document 1 to the clipboard
end tell
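One way to sidestep both Word and the clipboard might be the textutil command-line tool that ships with Mac OS X 10.4 and later, which can convert a Word file to plain text. This swaps in a different technique from the scripts above and is only a sketch: plainTextOf is a hypothetical helper name, and whether textutil reads every Word file in this workflow is an assumption that would need checking:

```applescript
-- Hypothetical helper: return the plain-text contents of a dropped file
-- by asking textutil to convert it and write the result to stdout.
on plainTextOf(theFile)
	set posixPath to POSIX path of theFile
	return do shell script "/usr/bin/textutil -convert txt -stdout " & quoted form of posixPath
end plainTextOf

-- Droplet entry point: runs once with the list of items dropped on the script.
on open droppedItems
	repeat with itemRef in droppedItems
		set docText to plainTextOf(contents of itemRef)
		-- docText now holds plain text; paragraphs of docText can be
		-- searched here and the results handed on to the Quark steps.
	end repeat
end open
```

By default do shell script turns line endings in the result into returns, which "paragraphs of" handles in any case.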