Help with simple find and replace script

Hello,
I am new to this forum and need some help writing an AppleScript that searches through a text file and replaces the 3rd tab on each line with the following text ", ". I realize this is probably a simple script for a scripter, but it is a daunting task for a nonscripting veterinarian.
Any advice or direction you can provide would be appreciated.
Thanks a bunch in advance.
Mike

Model: G5 dual 2 mHz
AppleScript: 1.10.3
Browser: Safari 417.8
Operating System: Mac OS X (10.4)

Dr. Mike:

No sweat! What kind of text files are we talking about here? Generic? Word? How about a few more details? Do you want to save the original files? A few more specifics would be greatly appreciated.

Craig Smith, DVM
Tacoma Cat Hospital
Tacoma, WA

Craig,

Thanks for your prompt response.
The file is generated by our practice management software and consists of a list of patient IDs, patient names, dates of birth, and referring DVM info.
The practice management software puts the patient and client info in separate, tab-delimited fields in the output file. I need to combine the client and patient names into a single field for import into another program on several different machines.

These files are generated as RTF files. I intended to do the editing in the TextEdit application. I need to take a file containing numerous lines like this:

24171 24171 Foreman Carbon True Dr. Chris Omoto 01/29/90

and convert to lines like this

24171 24171 Foreman, Carbon True Dr. Chris Omoto 01/29/90

The key is that the text

ClientName PatientName

needs to be replaced by the text

ClientName, PatientName

In other words, I need to replace the 3rd tab on the line with a ", ". This needs to happen on all of the lines in the file.

Ideally, the files should be saved with a slightly modified file name (e.g. filename1) to ensure we don’t overwrite the original file if errors are found.

Thanks for the assistance.

Mike

This is relatively easy to do with sed:

e.g.:

sed 's/	/, /3' /path/to/file.txt >>/path/to/convertedfile.txt

(Note: the sed that ships with Mac OS X doesn’t recognize “\t” as a tab, so in a Terminal window type sed 's/ then press Control-V, then press the Tab key to get a literal tab in there.)

This simply replaces the 3rd occurrence of a tab on each line with ", " and writes the result to a new file (convertedfile.txt).
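
If typing a literal tab is awkward, here is a minimal sketch of the same idea that lets printf generate the tab and the shell interpolate it into the sed expression. The function name replace_third_tab and the sample record are only for illustration; they are not part of sed or of the original command.

```shell
#!/bin/sh
# Replace the 3rd tab on each line of standard input with ", ".
# replace_third_tab is a hypothetical helper name used for this sketch.
replace_third_tab() {
    tab=$(printf '\t')      # build a literal tab character portably
    sed "s/${tab}/, /3"     # the trailing 3 means "third match on the line"
}

# Example: one tab-separated record shaped like the sample line above.
printf '24171\t24171\tForeman\tCarbon\tTrue\tDr. Chris Omoto\t01/29/90\n' | replace_third_tab
```

The double quotes around the sed expression are what allow ${tab} to expand; with single quotes the variable name would be passed to sed unchanged.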

However, RTF files present a more difficult problem, at least for a shell script. If you have the option to convert your files to plain text, then this will work perfectly; if you don’t, you’ll have to use a more elaborate solution.

Here’s a version that works in AppleScript:

set aRecord to "24171" & tab & "24171" & tab & "Foreman" & tab & " Carbon" & tab & " True" & tab & " Dr. Chris Omoto" & tab & " 01/29/90"
set {TID, text item delimiters} to {text item delimiters, tab}
set thisRecord to text items of aRecord
set text item delimiters to TID
set newRec to ""
repeat with k from 1 to 7
	if k = 3 then
		set newRec to newRec & (item k of thisRecord) & ","
	else if k = 7 then
		set newRec to newRec & (item k of thisRecord)
	else
		set newRec to newRec & (item k of thisRecord) & tab
	end if
end repeat
newRec
--> "24171	24171	Foreman, Carbon	 True	 Dr. Chris Omoto	 01/29/90"

Adam,
Thanks for the script. Maybe I did not make it clear, but the script I need should open a file containing 20 or 30 lines of unique text that is all formatted similarly. On each of these lines I need to replace the 3rd tab with the comma space combination. Is there a way to modify your script to read each line and then perform the established changes on a line by line basis?
Thanks
Mike

Chris,
Thanks for the assistance. I am completely unfamiliar with sed. I gather it is a Terminal command rather than an AppleScript. I will investigate and ask for further assistance when I know enough to ask an intelligent question.
Thanks again.
Mike

Mike:

Here is my suggestion. Sorry I could not post it sooner; I have been dealing with some nasty stomatitis cases today.

set a to choose file

tell application "TextEdit"
	set fix_file to open a
	set para_old to every paragraph in front document
	close front document
end tell
set new_para to ""
repeat with this_para in para_old
	set loc4 to (offset of (fourth word of this_para) in this_para)
	set new_para to new_para & ((characters 1 thru (loc4 - 2) of this_para) & "," & (characters loc4 thru -1 of this_para)) as string
end repeat
tell application "TextEdit"
	set new_file to make new document at front
	set new_file's text to new_para
end tell

This lets you choose your original text file and then rebuilds each line, removing the third tab, which sits just before the 4th word in each line. It then closes the original file and writes a new one, with all the other tabs preserved and the last name and first name separated by a comma only, no space.

Let me know if this is closer to what you need.

Here’s that same sed command integrated into an AppleScript so you can just run it without fooling around with the Terminal. Remember, this is only going to work with plain text files, NOT RTF.

set inputFile to (choose file with prompt "Please select a text file to convert:")
tell application "Finder"
	set outputFileName to (name of inputFile) & "_Converted.txt"
end tell
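-- Use whatever name the user confirms or types in the dialog
set outputFileName to text returned of (display dialog "Name OK?" default answer outputFileName buttons {"OK", "Cancel"} default button 1)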

set outputFile to (path to desktop folder) & outputFileName as string

do shell script "sed 's/	/, /3' " & (quoted form of POSIX path of inputFile) & space & ">>" & (quoted form of POSIX path of outputFile)

Chris:

It worked on the test RTF file I generated for this issue, and it worked nicely. It will be interesting to see if it works for Mike.

Craig,
Thanks for the prompt response. I am happy to say I actually understand what the script is supposed to do! I love the ability to manually select the desired file. Your description of how the script works matches my expectation as well.

Unfortunately, when I run the script I get the error message:

Can’t get word 4 of “”.

Any suggestions? Would it help if I sent you a copy of an actual file?

Thanks again.

Mike

Mike:

I would use Chris’s script first, and see if it works. UNIX is pretty mind boggling to see at first, but it is very, very good at text manipulation. Those that can wield it deserve to be heard.

As for my script, it is having difficulty with the blank lines in between the text filled lines. I had not thought of that. Here is a fix, in case you still want to try it:

set a to choose file

tell application "TextEdit"
	set fix_file to open a
	set para_old to every paragraph in front document
	close front document
end tell
set new_para to ""
repeat with this_para in para_old
	try
		set loc4 to (offset of (fourth word of this_para) in this_para)
		set new_para to new_para & ((characters 1 thru (loc4 - 2) of this_para) & "," & (characters loc4 thru -1 of this_para)) as string
	end try
end repeat
tell application "TextEdit"
	set new_file to make new document at front
	set new_file's text to new_para
end tell

I have not tested it, but it should work. With the try statement, a paragraph is only processed if it does not raise an error. As a result, no blank lines are built into your new text file, since only paragraphs that actually contain a fourth word get rebuilt.

Good luck, let us know how this all turns out.

Chris,
The script ran successfully but only replaced the text on the first line. The remaining lines were left as is.
Thanks
Mike

Craig,
Your second script works like a charm. Thanks a million. I have learned a lot about AppleScript and a little about Unix with this effort. I will keep all of the scripts and try to play with them further to learn more about how they work.
Thanks again!
Mike

Mike:

Glad to hear it. Be sure to peruse our Books & Resources page if you ever get serious about wanting some AppleScript or Unix texts.

And stop back by if you need anything else. As you can see, we have dozens of helpful people on the bbs; please DO NOT be shy about asking for assistance at any time. You are certain to learn a lot as you go, and you can bet that others are learning from your posts as well.

mrbroome, in response to your statement about the script replacing the tabs only on the first line: I think it might be because of your line breaks. Macintosh line breaks are \r whereas Unix line breaks are \n. My script will only work with the \n variety; without any \n, sed treats the whole file as one long line and so replaces only the third tab in the entire file.

I think Craig already hooked you up with a good solution, so this post is mainly for anyone else out there who needs to know…

But if you want to make sure the line breaks are correct, you can do something like:

cat path/to/file | tr '\r' '\n' | sed …

The tr will just convert the Mac line breaks to Unix ones…
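
To show where the tr step sits relative to the sed command, here is a rough sketch under a few assumptions: the helper name convert_records, the throwaway sample file, and its contents are all made up for the demonstration. The tr output is piped straight into sed, so sed no longer needs the file path as an argument.

```shell
#!/bin/sh
# Convert classic Mac (\r) line endings to Unix (\n), then replace the
# 3rd tab on every line with ", ".
# convert_records is a hypothetical helper name used for this sketch.
convert_records() {
    tab=$(printf '\t')                          # literal tab for sed
    tr '\r' '\n' < "$1" | sed "s/${tab}/, /3"   # fix line breaks first, then edit
}

# Demo: a temporary file holding two records separated by \r only.
sample=$(mktemp)
printf '1\t1\tSmith\tRex\tTrue\r2\t2\tJones\tTom\tTrue\r' > "$sample"
convert_records "$sample"
rm -f "$sample"
```

Reading the file with < instead of cat is only a small simplification; the cat form above behaves the same way.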

Chris:

I don’t know about anyone else, but I was certainly wondering about it. Thanks for the update and clarification.

Chris,
Thanks for the ongoing interest in my problem and possible solutions.

As you mentioned, Craig’s script works perfectly. Nevertheless, I would be interested in understanding where the line you referenced, namely

cat path/to/file | tr '\r' '\n' |

fits in the rest of the script. I am not trying to beat the proverbial dead horse here but I think two solutions are better than one.

Thanks for your continued interest.
Mike