I am sure I was set this question in my Computer Studies A-level back in 1979. I was lucky to have been taught by Miss Judith Jolly, who bought an Apple II for the Tavistock Comprehensive School. She was a great inspiration, but I never did grasp the fundamentals of sorting data!
I need to extract text, for example 1000 characters at a time, and paste each extract into a new document. The source is a page of text without any commas or full stops (although I might later try to write a script which ‘delimits’ the names and puts a comma between them).
Any assistance would be gratefully received. I have really got into AppleScript. Some of my dreams are coming true, whilst other times I lie there in a ‘for next loop’!
Thanks Pete
Pro FCP VFX editor Pinewood UK
-- A script to copy text from a simple text file into new documents, each with a maximum of 50 characters and no truncated words.
-- list from textEDIT doc
set LongPlayersList to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez " --Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
-- to indicate
set characterCount to count of characters of LongPlayersList
display dialog ("Number of characters in this selection is " & characterCount)
if characterCount is greater than 200 then
set PlayersList1 to (characters 1 thru 50 of LongPlayersList) as string
set PlayersList2 to (characters 51 thru 100 of LongPlayersList) as string
set PlayersList3 to (characters 101 thru 150 of LongPlayersList) as string
set PlayersList4 to (characters 151 thru 200 of LongPlayersList) as string
display dialog ("PlayersList1" & PlayersList1)
display dialog ("PlayersList2" & PlayersList2)
display dialog ("PlayersList3" & PlayersList3)
display dialog ("PlayersList4" & PlayersList4)
-- create new text document and paste text of PlayersList1
-- create new text document and paste text of PlayersList2
-- create new text document and paste text of PlayersList3
-- create new text document and paste text of PlayersList4
else
if characterCount is greater than 150 then
set PlayersList1 to (characters 1 thru 50 of LongPlayersList) as string
set PlayersList2 to (characters 51 thru 100 of LongPlayersList) as string
set PlayersList3 to (characters 101 thru 150 of LongPlayersList) as string
set PlayersList4 to (characters 151 thru characterCount of LongPlayersList) as string
display dialog ("PlayersList1" & PlayersList1)
display dialog ("PlayersList2" & PlayersList2)
display dialog ("PlayersList3" & PlayersList3)
display dialog ("PlayersList4" & PlayersList4)
-- create new text document and paste text of PlayersList1
-- create new text document and paste text of PlayersList2
-- create new text document and paste text of PlayersList3
-- create new text document and paste text of PlayersList4
else
if characterCount is greater than 100 then
set PlayersList1 to (characters 1 thru 50 of LongPlayersList) as string
set PlayersList2 to (characters 51 thru 100 of LongPlayersList) as string
set PlayersList3 to (characters 101 thru characterCount of LongPlayersList) as string
display dialog ("PlayersList1" & PlayersList1)
display dialog ("PlayersList2" & PlayersList2)
display dialog ("PlayersList3" & PlayersList3)
-- create new text document and paste text of PlayersList1
-- create new text document and paste text of PlayersList2
-- create new text document and paste text of PlayersList3
else
if characterCount is greater than 50 then
set PlayersList1 to (characters 1 thru 50 of LongPlayersList) as string
set PlayersList2 to (characters 51 thru characterCount of LongPlayersList) as string
display dialog ("PlayersList1" & PlayersList1)
display dialog ("PlayersList2" & PlayersList2)
-- create new text document and paste text of PlayersList1
-- create new text document and paste text of PlayersList2
set the numberDocs to 2
else
if characterCount is greater than 0 then
set PlayersList1 to (characters 1 thru characterCount of LongPlayersList) as string
display dialog ("PlayersList1" & PlayersList1)
-- create new text document and paste text of PlayersList1
set the numberDocs to 1
end if
end if
end if
end if
end if
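The ladder of nested tests above can also be written as a single loop. The following is a minimal sketch, not the original poster's code: the names sampleText, chunkSize, chunks, startPos and endPos are made up for illustration, and the text is shortened. Like the script above, it cuts at fixed character positions and so can still split a word.

```applescript
-- Hypothetical names throughout; a sketch of fixed-width slicing in a loop.
set sampleText to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard"
set chunkSize to 50 -- maximum characters per chunk
set chunks to {}
set textLength to (count sampleText)
set startPos to 1
repeat while startPos <= textLength
	set endPos to startPos + chunkSize - 1
	-- The last chunk may be shorter than chunkSize.
	if endPos > textLength then set endPos to textLength
	set end of chunks to text startPos thru endPos of sampleText
	set startPos to endPos + 1
end repeat
chunks -- a list of strings, each at most chunkSize characters long
```

Because the loop runs until the text is used up, it handles any length of input without adding another branch for each extra 50 characters.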
Interesting problem. This isn’t optimized for speed, but it works for the list of words you provided. allDocs is a list of 50 character or less strings that don’t start or end with a space and don’t split a word.
set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
set allDocs to {}
set W to words of LPL
set k to 1
repeat
try
set P to characters k thru (k + 49) of LPL as string
if first character of P is space then set P to (characters 2 thru -1 of P) as text
if last character of P is space then set P to (characters 1 thru -2 of P) as text
if last word of P is not in W then -- we've split a word
set P to words 1 thru -2 of P
set AppleScript's text item delimiters to space
set P to P as text
set AppleScript's text item delimiters to ""
end if -- now find out where we left off
set k to (offset of (last word of P) in LPL) + (length of last word of P)
set end of allDocs to P -- stick it in our list of strings
on error -- we've hit the end of the original string
set end of allDocs to characters (k + 1) thru -1 of LPL as text
exit repeat
end try
end repeat
--> {"Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger", "John Arne Riise Harry Kewell Steven Gerrard", "Robbie Fowler Luis Garcia Mark Gonzalez Fabio", "Aurelio Xabi Alonso Peter Crouch Jermaine Pennant", "Craig Bellamy Dirk Kuyt Momo Sissoko Jamie", "Carragher Pepe Reina Paul Anderson Stephen", "Warnock Gabriel Paletta Boudewijn Zenden Danny", "Guthrie Adam Hammill Lee Peltier Craig Lindfield"}
It is truly amazing how you have turned this on its head to find the solution!
But how do I get the four separate resultant documents in this example?
set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
set allDocs to {}
set W to words of LPL
set k to 1
repeat
try
set P to characters k thru (k + 99) of LPL as string
if first character of P is space then set P to (characters 2 thru -1 of P) as text
if last character of P is space then set P to (characters 1 thru -2 of P) as text
if last word of P is not in W then -- we've split a word
set P to words 1 thru -2 of P
set AppleScript's text item delimiters to space
set P to P as text
set AppleScript's text item delimiters to ""
display dialog allDocs as text
end if -- now find out where we left off
set k to (offset of (last word of P) in LPL) + (length of last word of P)
set end of allDocs to P -- stick it in our list of strings
on error -- we've hit the end of the original string
set end of allDocs to characters (k + 1) thru -1 of LPL as text
exit repeat
end try
end repeat
-- eg
--> doc1{Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie}
--> doc2 {Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant....}
--> doc3 {Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen...}
--> doc4 {Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield}
Like so… (and I didn’t “turn it on its head”, I just pursued what you stated as the conditions on each doc)
set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
set allDocs to {}
set W to words of LPL
set k to 1
repeat
try
set P to characters k thru (k + 99) of LPL as string
if first character of P is space then set P to (characters 2 thru -1 of P) as text
if last character of P is space then set P to (characters 1 thru -2 of P) as text
if last word of P is not in W then -- we've split a word
set P to words 1 thru -2 of P
set AppleScript's text item delimiters to space
set P to P as text
set AppleScript's text item delimiters to ""
end if -- now find out where we left off
set k to (offset of (last word of P) in LPL) + (length of last word of P)
set end of allDocs to P -- stick it in our list of strings
on error -- we've hit the end of the original string
set end of allDocs to characters (k + 1) thru -1 of LPL as text
exit repeat
end try
end repeat
-- publish the docs
repeat with k from 1 to count allDocs
writeDoc(item k of allDocs, k)
end repeat
-- handler to write them to text files (could have been incorporated in the scheme above but didn't know if you wanted allDocs)
to writeDoc(someText, idx)
set f to open for access ((path to desktop as text) & "doc[" & idx & "].txt") with write permission
try
write someText to f
close access f
on error -- make sure it's closed or you can't trash it!
close access f
end try
end writeDoc
I was so focused on the characters that I failed to consider the words. Maybe one day I’ll be able to write such an elegant script. At least I am trying. I will implement this tomorrow
Welcome, Pete. I spent about 2 hours figuring out the heart of that first script; lots of false starts always coming back to the fact that you didn’t want to break words. I should also say that if your list runs to tens of thousands of words, the script can be speeded up very significantly, in several ways, but I was afraid they would confuse the main issue for you, since I presumed you would want to modify it and I didn’t want to spend too much longer getting that to work. The speeder-upper is to enclose variables that are large oft-visited lists in properties of a script within your script so that AppleScript keeps them as references to memory instead of lists to be broken down and rebuilt. Here’s one example in the last script in this article from our tutorial archives.
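As a rough illustration of the script-property idea described above, here is a minimal sketch. It is not taken from the tutorial mentioned; the script object o, its property wordList and the counter longCount are hypothetical names, and the speed benefit is only likely to be noticeable on lists far longer than this one.

```applescript
-- Hypothetical names: o, wordList and longCount are for illustration only.
script o
	property wordList : {}
end script

set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger"
set o's wordList to words of LPL

-- Count the words longer than five characters, reading each item
-- through the script object's property rather than a plain variable.
set longCount to 0
repeat with i from 1 to (count o's wordList)
	if (count (item i of o's wordList)) > 5 then set longCount to longCount + 1
end repeat
longCount
```

The only change from an ordinary loop is that the list is always addressed as o's wordList, so the same pattern can be dropped into an existing script by moving the large list into the property.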
4 years later I need to modify this script slightly…
I need it to save each fixed length line of text as a paragraph in one file.
like this:-
I need it to save each
fixed length line of text
as a paragraph in one
file.
I have tried but can’t work it out
set LPL to "I need it to save each fixed length line of text as a paragraph in one file"
set allDocs to {}
set W to words of LPL
set k to 1
repeat
try
set P to characters k thru (k + 49) of LPL as string
if first character of P is space then set P to (characters 2 thru -1 of P) as text
if last character of P is space then set P to (characters 1 thru -2 of P) as text
if last word of P is not in W then -- we've split a word
set P to words 1 thru -2 of P
set AppleScript's text item delimiters to space
set P to P as text
set AppleScript's text item delimiters to ""
end if -- now find out where we left off
set k to (offset of (last word of P) in LPL) + (length of last word of P)
set end of allDocs to P -- stick it in our list of strings
on error -- we've hit the end of the original string
set end of allDocs to characters (k + 1) thru -1 of LPL as text
exit repeat
end try
end repeat
--> {"I need it to save each fixed length line of text as a paragraph in one file"}
I think this is what you want. (tested once and worked)
set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
set allDocs to ""
set W to words of LPL
set k to 1
repeat
try
set P to characters k thru (k + 99) of LPL as string
if first character of P is space then set P to (characters 2 thru -1 of P) as text
if last character of P is space then set P to (characters 1 thru -2 of P) as text
if last word of P is not in W then -- we've split a word
set P to words 1 thru -2 of P
set AppleScript's text item delimiters to space
set P to P as text
set AppleScript's text item delimiters to ""
end if -- now find out where we left off
set k to (offset of (last word of P) in LPL) + (length of last word of P)
set allDocs to allDocs & P & return -- append it to the text as a new paragraph
on error -- we've hit the end of the original string
set allDocs to allDocs & characters (k + 1) thru -1 of LPL as text
exit repeat
end try
end repeat
-- publish the docs
writeDoc(allDocs)
-- handler to write them to text files (could have been incorporated in the scheme above but didn't know if you wanted allDocs)
to writeDoc(someText)
set f to open for access ((path to desktop as text) & "doc_.txt") with write permission
try
write someText to f
close access f
on error -- make sure it's closed or you can't trash it!
close access f
end try
end writeDoc
I’m sure that, with three and a half years’ more experience, Adam would now use more robust techniques.
set LPL to "Jerzy Dudek Steve Finnan Sami Hyypia Daniel Agger John Arne Riise Harry Kewell Steven Gerrard Robbie Fowler Luis Garcia Mark Gonzalez Fabio Aurelio Xabi Alonso Peter Crouch Jermaine Pennant Craig Bellamy Dirk Kuyt Momo Sissoko Jamie Carragher Pepe Reina Paul Anderson Stephen Warnock Gabriel Paletta Boudewijn Zenden Danny Guthrie Adam Hammill Lee Peltier Craig Lindfield"
set maxLen to 100 -- Maximum paragraph length.
set allParas to {}
set k to 1
set textLength to (count LPL)
repeat until (k > textLength)
set l to k + maxLen -- We'll be testing the (maxLen + 1)th character each time ...
if (l > textLength) then set l to textLength -- ... unless the end of the text comes before then.
set P to text k thru l of LPL
ignoring white space
if (P > space) then
if (P ends with space) or (l is textLength) then
set end of allParas to text 1 thru word -1 of P
else
set end of allParas to text 1 thru word -2 of P
end if
set k to k + (count result) + 1
else
set k to k + (count P)
end if
end ignoring
end repeat
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to return
set allParas to allParas as text
set AppleScript's text item delimiters to astid
-- publish the docs
writeDoc(allParas)
-- handler to write them to text files (could have been incorporated in the scheme above but didn't know if you wanted allParas)
to writeDoc(someText)
set f to (open for access file ((path to desktop as text) & "doc_.txt") with write permission)
try
set eof f to 0
write someText to f
close access f
on error -- make sure it's closed or you can't trash it!
close access f
end try
end writeDoc
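For comparison, the same wrapping can be approached by building each line one word at a time, which avoids searching the source text for positions. This is only a sketch under some assumptions: the handler name wrapText and its parameter names are invented here, it relies on AppleScript's words, so any punctuation in the source would be dropped (the text in this thread has none), and a single word longer than maxLen would simply get a line of its own.

```applescript
-- Hypothetical handler: wrapText, sourceText and maxLen are illustrative names.
on wrapText(sourceText, maxLen)
	set wrappedLines to {}
	set currentLine to ""
	repeat with thisWord in (words of sourceText)
		set thisWord to contents of thisWord
		if currentLine is "" then
			-- First word of a new line.
			set currentLine to thisWord
		else if (count currentLine) + 1 + (count thisWord) > maxLen then
			-- Adding this word (plus a space) would exceed the limit: start a new line.
			set end of wrappedLines to currentLine
			set currentLine to thisWord
		else
			set currentLine to currentLine & space & thisWord
		end if
	end repeat
	if currentLine is not "" then set end of wrappedLines to currentLine
	return wrappedLines
end wrapText

wrapText("I need it to save each fixed length line of text as a paragraph in one file", 25)
--> {"I need it to save each", "fixed length line of text", "as a paragraph in one", "file"}
```

The resulting list can be joined with return as the delimiter and passed to writeDoc in the same way as allParas above.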