script to create text files from delimited sections of one original?

Hello,

I would like to figure out how I can take one text file (the output of a data merge) and automatically process it into multiple files, each containing the appropriate delimited content. For instance, if the main original text file had the following content:

[break-into-file-with-name:2007-04-05-smith-john.txt]
contents of first file filler text here abc
[break-into-file-with-name:2007-04-05-jones-bill.txt]
contents of second file more filler text xyx
[break-into-file-with-name:2007-04-12-smathers-donna.txt]
contents of third file …
[break-into-file-with-name:2007-04-12-jeremiah-ronald.txt]
contents of nth file …

then the result after running the script would be four separate files:

2007-04-05-smith-john.txt
2007-04-05-jones-bill.txt
2007-04-12-smathers-donna.txt
2007-04-12-jeremiah-ronald.txt

The sole contents of 2007-04-05-smith-john.txt would be “contents of first file filler text here abc”, the sole contents of 2007-04-05-jones-bill.txt would be “contents of second file more filler text xyx”, etc.

Unfortunately I cannot create this script myself, as I’m a newbie. I would really appreciate some help!

Thank you,
Erik

Model: PowerBook G4
AppleScript: 1.10.7
Browser: Firefox 2.0.0.3
Operating System: Mac OS X (10.4)

Erik:

Is what you posted an exact copy of what is in the file? For this to operate correctly, a pattern needs to be established that repeats every time you need the new file. If your text file indeed contains this string:

[break-into-file-with-name:

in front of every new text file name and subsequent contents, then this will be a snap.

Hi Craig,

Thank you for your post. What I posted is illustrative of what the file will contain. I have complete control over the contents because I’ll be creating the file from a data merge that I will create. So…yes, the file will have a distinct delimiting pattern that can separate the ‘sections’ for each file and identify what the filename of each file should be.

A bit more info about how I envision the entire work-flow, just in case it puts things in context:

  • merge data/text from a spreadsheet into a Word document (these merges are easy from xls to doc)
  • copy the text from the Word document into a TextWrangler document (to get plain text that can be used with cron)
  • run the script on the TextWrangler document to turn one file into many files
  • upload those new text files to my webserver
  • set a cronjob to email the contents of the various files to specific people at specific times (e.g. set cron to run this shell command at the right date/time: cat file001.txt | mail -s "Email Reminder" person001@fakeemail.com)
    (crontab contents also to be created via data merge and uploaded)

Do you think you can help me create an AppleScript that will break the big text file into a bunch of small text files that cron can call after I upload them to my webserver?

Thank you,
Erik

Erik:

I think so. I have never used (or scripted) TextWrangler; I am a TextEdit kind of guy. Anyway, based on your earlier example, this script will separate the data into two-item lists, where the first item is the name of the file and the second item is the contents of the file:

set txt_file to "[break-into-file-with-name:2007-04-05-smith-john.txt]
contents of first file filler text here abc
[break-into-file-with-name:2007-04-05-jones-bill.txt]
contents of second file more filler text xyx
[break-into-file-with-name:2007-04-12-smathers-donna.txt]
contents of third file ...
[break-into-file-with-name:2007-04-12-jeremiah-ronald.txt]
contents of nth file ..."

-- Split the text on the marker that precedes every file name
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "[break-into-file-with-name:"
set txt_Pieces to txt_file's every text item
set AppleScript's text item delimiters to astid
-- Build a list of {file name, file contents} pairs
set file_Info to {}
repeat with an_Item in txt_Pieces
	if an_Item contains "]" then
		set split_Point to offset of "]" in an_Item
		-- Drop a trailing line break only if one is present, so the last piece is not clipped
		set body_End to -1
		if an_Item ends with return or an_Item ends with (ASCII character 10) then set body_End to -2
		set end of file_Info to {(text 1 thru (split_Point - 1) of an_Item) as string, (text (split_Point + 2) thru body_End of an_Item) as string}
	end if
end repeat
-->{{"2007-04-05-smith-john.txt", "contents of first file filler text here abc"}, {"2007-04-05-jones-bill.txt", "contents of second file more filler text xyx"}, {"2007-04-12-smathers-donna.txt", "contents of third file ..."}, {"2007-04-12-jeremiah-ronald.txt", "contents of nth file ..."}}

Is this pretty close to what you are looking for?
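
Run as-is, the script above writes nothing to disk; it only leaves the list of pairs in Script Editor's Result pane. Here is a minimal sketch for viewing the pairs in a dialog instead. The handler name summarize_pairs is hypothetical, and the sample data is copied from the example above rather than read from a file:

```applescript
-- Hypothetical helper: turn {name, contents} pairs into one line per file
on summarize_pairs(file_Info)
	set summary_Lines to {}
	repeat with a_Pair in file_Info
		set end of summary_Lines to (item 1 of a_Pair) & " -> " & (item 2 of a_Pair)
	end repeat
	-- Join the lines with returns, then restore the old delimiters
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to return
	set summary_Text to summary_Lines as string
	set AppleScript's text item delimiters to astid
	return summary_Text
end summarize_pairs

-- Sample data (an assumption: two of the pairs from the example above)
set file_Info to {{"2007-04-05-smith-john.txt", "contents of first file filler text here abc"}, {"2007-04-05-jones-bill.txt", "contents of second file more filler text xyx"}}
display dialog summarize_pairs(file_Info) buttons {"OK"} default button 1
```

Each line of the dialog should show one file name followed by the text that would go into that file.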

Thank you for your continued help Craig.

I clicked on “Open this scriptlet in your editor” and compiled and ran the script. I didn’t get an error but I’m not exactly sure what happened.

Here’s what I’d ideally like to happen:

  • user (me) runs script
  • script asks user to browse to select text file to process (this text file will be the output of the data merge that I mentioned previously)
  • user selects text file and clicks ‘process’
  • script takes the big text file and converts it to a bunch of little text files, each with a name that is shown between “[break-into-file-with-name:” and “]” and with a body of the text that falls after the “]” but before the next “[break-into-file-with-name:”

Note: I can change the delimiting scheme if that would be helpful. e.g.
[Start File]
[Filename: xxxx.txt]
body body body adsfa
[End File]
[Start File]
[Filename: yyyy.txt]
body body body adsfa other fake text nah nah
[End File]
[Start File]

Does that make any sense?

Thank you,
Erik
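
For the alternative [Start File] / [Filename: ...] / [End File] scheme suggested above, a parser could look roughly like the sketch below. It was not part of the thread's solution. The handler names parse_blocks and trim_text are hypothetical, and the sketch assumes every block contains exactly one [Filename: ...] line and that file names contain no "]" character:

```applescript
-- Hypothetical helper: strip spaces, tabs and line breaks from both ends
on trim_text(t)
	set white_Chars to {space, tab, return, (ASCII character 10)}
	repeat while t is not "" and (character 1 of t) is in white_Chars
		if (length of t) is 1 then return ""
		set t to text 2 thru -1 of t
	end repeat
	repeat while t is not "" and (character -1 of t) is in white_Chars
		if (length of t) is 1 then return ""
		set t to text 1 thru -2 of t
	end repeat
	return t
end trim_text

-- Hypothetical handler: returns a list of {file name, file contents} pairs
on parse_blocks(txt)
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "[Start File]"
	set blocks to text items of txt
	set file_Info to {}
	repeat with a_Block in blocks
		set block_Text to contents of a_Block
		if block_Text contains "[Filename:" and block_Text contains "[End File]" then
			-- Keep only what sits between "[Filename:" and "[End File]"
			set AppleScript's text item delimiters to "[Filename:"
			set after_Name to text item 2 of block_Text
			set AppleScript's text item delimiters to "[End File]"
			set name_And_Body to text item 1 of after_Name
			-- The first "]" closes the file name
			set split_Point to offset of "]" in name_And_Body
			set file_Name to trim_text(text 1 thru (split_Point - 1) of name_And_Body)
			if split_Point < (length of name_And_Body) then
				set file_Body to trim_text(text (split_Point + 1) thru -1 of name_And_Body)
			else
				set file_Body to ""
			end if
			set end of file_Info to {file_Name, file_Body}
		end if
	end repeat
	set AppleScript's text item delimiters to astid
	return file_Info
end parse_blocks

-- Sample text built from the example in the post above
set sample_Text to "[Start File]" & return & "[Filename: xxxx.txt]" & return & "body body body adsfa" & return & "[End File]" & return & "[Start File]" & return & "[Filename: yyyy.txt]" & return & "body body body adsfa other fake text nah nah" & return & "[End File]" & return
parse_blocks(sample_Text)
```

Trimming both ends makes the space after "[Filename:" harmless, at the cost of also removing any leading or trailing blank lines a body was meant to keep.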

Erik:

Sure it does. I think I may have assumed a bit much before; I am sorry. Try this out and see what happens:

set txt_file_LOC to choose file with prompt "Select a file to process:"
set dest_folder to (choose folder with prompt "Select a folder for the newly created files:") as Unicode text
set txt_file to read txt_file_LOC
-- Split the text on the marker that precedes every file name
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "[break-into-file-with-name:"
set txt_Pieces to txt_file's every text item
set AppleScript's text item delimiters to astid
-- Build a list of {file name, file contents} pairs
set file_Info to {}
repeat with an_Item in txt_Pieces
	if an_Item contains "]" then
		set split_Point to offset of "]" in an_Item
		-- Drop a trailing line break only if one is present, so the last piece is not clipped
		set body_End to -1
		if an_Item ends with return or an_Item ends with (ASCII character 10) then set body_End to -2
		set end of file_Info to {(text 1 thru (split_Point - 1) of an_Item) as string, (text (split_Point + 2) thru body_End of an_Item) as string}
	end if
end repeat
-- Write each piece to its own file in the destination folder
repeat with file_item in file_Info
	set new_file to (dest_folder & (file_item's item 1 as string))
	set write_file to open for access file new_file with write permission
	set eof of write_file to 0 -- empty the file first in case it already exists
	write (file_item's item 2 as string) to write_file
	close access write_file
end repeat

You should be prompted to select the file for processing and then a destination folder. As long as the data in the selected file is exactly like what you had in your first post, it should work. If not, there will be a file in whatever folder you chose entitled CASDVM. Email that file to me, along with the original text file (or the first part of it), and I should be able to find the issues.
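
One fragile spot in the write loop above: if the write command fails, close access never runs and the file stays open until the script editor quits. Here is a minimal sketch of a helper that always closes the file. The handler name write_text_file and the demo file name are assumptions for illustration, not part of the original script:

```applescript
-- Hypothetical helper: write text to a file and always close it afterwards
on write_text_file(file_Path, file_Text)
	set write_file to open for access file file_Path with write permission
	try
		set eof of write_file to 0 -- start from an empty file
		write file_Text to write_file
		close access write_file
	on error err_Msg number err_Num
		-- Close the file before passing the error along
		try
			close access write_file
		end try
		error err_Msg number err_Num
	end try
end write_text_file

-- Demo: write a sample file into the temporary items folder and read it back
set demo_Path to (path to temporary items as Unicode text) & "split-demo.txt"
write_text_file(demo_Path, "contents of first file filler text here abc")
read file demo_Path
```

In the loop, the three open/write/close lines could then be replaced by a single call such as write_text_file(new_file, file_item's item 2 as string).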

It worked great.
Thank you very much Craig!
Btw - what is your role at Applescript.net?

I maintain the Books & Resources section, and try to take care of the AppleScript Links pages as well. I have also put together a series of beginner tutorials on AppleScript, which can be found on our unScripted pages.

Glad it all worked out. Just how many files did we create? Did it seem to go pretty fast?

Hi Craig,

I need to check out those beginner tutorials!

My initial usage of the script was on one file that the script split into 224 relatively small text files. The script worked very quickly and the resulting files worked flawlessly when called by the cronjobs and shell scripts. I just sent a donation via the site’s donation page because I’m so thankful for and impressed by your help.

By the way - I have an AppleScript that I’d like to share. It creates folders in Apple Mail from a list on the clipboard that was copied (or cut) from a list in Excel. If I send the script to you, can you post it in the appropriate place? If any other newbies want that functionality, I’d hate to have them reinvent the wheel. (I did a fair bit of searching for existing solutions before paying a Rent-a-Coder $10 to create it.)

All the best,
Erik

Actually, Erik, the best place to post it would be right here on MacScripter at the ScriptBuilders Page. Check it out and let me know if you need any help posting it.

Hi Craig,

I changed the code so it just looks for “[Filename:NAME]” instead of the more cumbersome delimiter I posted before. Here is the updated script, just in case anyone else needs it:


set txt_file_LOC to choose file with prompt "Select a file to process:"
set dest_folder to (choose folder with prompt "Select a folder for the newly created files:") as Unicode text
set txt_file to read txt_file_LOC
-- Split the text on the marker that precedes every file name
set astid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "[Filename:"
set txt_Pieces to txt_file's every text item
set AppleScript's text item delimiters to astid
-- Build a list of {file name, file contents} pairs
set file_Info to {}
repeat with an_Item in txt_Pieces
	if an_Item contains "]" then
		set split_Point to offset of "]" in an_Item
		-- Drop a trailing line break only if one is present, so the last piece is not clipped
		set body_End to -1
		if an_Item ends with return or an_Item ends with (ASCII character 10) then set body_End to -2
		set end of file_Info to {(text 1 thru (split_Point - 1) of an_Item) as string, (text (split_Point + 2) thru body_End of an_Item) as string}
	end if
end repeat
-- Write each piece to its own file in the destination folder
repeat with file_item in file_Info
	set new_file to (dest_folder & (file_item's item 1 as string))
	set write_file to open for access file new_file with write permission
	set eof of write_file to 0 -- empty the file first in case it already exists
	write (file_item's item 2 as string) to write_file
	close access write_file
end repeat

I plan to post that Mail mailbox script, too.

All the best,
Erik

Model: PowerBook G4
AppleScript: 2.1.1
Browser: Firefox 2.0.0.3
Operating System: Mac OS X (10.4)