Reprocessing a text file to meet a fixed-width format requirement

I just started with this last week, so I’m pretty new to AppleScript.

1st: What I have is a database in FileMaker with the following data entries:
Rank: First Name Name Addr. Nbr. Street: ZIP City:
1 Ilres 146 rue du Cimetiere 1338 Luxembourg
2 Bibliotheque nationale c/o E. Noesen 437 bd F-D Roosevelt 2450 Luxembourg

2nd: I need to deliver a file which represents the same data but with different formatting:
Rank: First Name Name Addr. Nbr. Street: ZIP City:
15 chars/15 chars/32 chars/8 chars/50 chars/4 chars/36 chars (160 chars total)
something like this: "1 Ilres 146 rue du Cimetiere 1338Luxembourg
"
3rd: My script handles the first line of the document. Now I want it to cycle through all the paragraphs and append the new lines to a new document, so that I end up with the finished file.

And last but not least, if someone has an idea how to solve this directly in FileMaker Pro 8, I would appreciate the hint very much.

set a to choose file
tell application "TextEdit" to open a

set p to count paragraph of document 1 of application "TextEdit"
set p0 to 1
set finalText to ""


repeat until p0 = p
	set theText to get paragraph p0 of document 1 of application "TextEdit"
	
	set splitted to split(theText, "	", p0, finalText)
	set p0 to p0 + 1
	
	return splitted
end repeat

--tell application "TextEdit"
--	set new_file to make new document at front
--	set new_file's text to finalText as string
--end tell

--split(theText, "	") --> Split action
--to split(parText, delimiter)
on split(parText, delimiter, p0, finalText)
	set AppleScript's text item delimiters to delimiter
	set parText to parText's text items
	set AppleScript's text item delimiters to {""} --> restore delimiters to default value
	
	-- Space check 1
	set i to 1
	
	repeat while not i ≥ 8
		
		if i = 1 or i = 2 then
			set itemLen to 15
		else
			if i = 3 then
				set itemLen to 32
			else
				if i = 4 then
					set itemLen to 8
				else
					if i = 5 then
						set itemLen to 50
					else
						if i = 6 then
							set itemLen to 4
						else
							if i = 7 then
								set itemLen to 36
							else
								return "error"
							end if
						end if
					end if
				end if
			end if
		end if
		
		set theTruth to (item i of parText)
		set count1 to count theTruth
		
		repeat until (count1) = itemLen
			set theTruth to theTruth & space
			set count1 to count theTruth
		end repeat
		
		set count1 to theTruth
		
		set (item i of parText) to theTruth
		
		set i to i + 1
		
	end repeat
	
	if p0 = 1 then
		set finalText to parText as string
	else
		if p0 > 1 then
			set finalText to finalText & finalText & parText as string
		else
			return "error"
		end if
	end if
	
	--tell application "TextEdit"
	--	set new_file to make new document at front
	--	set new_file's text to parText as string
	--end tell
	
end split

Model: Powerbook G4 17"
AppleScript: 1.10.3
Browser: Firefox 1.5.0.1
Operating System: Mac OS X (10.4)

Hi despite,

is your text file tab delimited?

maybe you can try this:


set a to choose file
tell application "TextEdit" to open a

set theText to text of document 1 of application "TextEdit"
set newText to (do shell script "echo " & quoted form of theText & " | tr \\n \\r | awk -F\\t '{printf\"%15d\\t%15s\\t%32s\\t%8s\\t%50s\\t%4s\\t%36s\\r\", $1, $2, $3, $4, $5, $6, $7}' ")

tell application "TextEdit"
	set newDoc to make new document
	set text of newDoc to newText
	save newDoc in (choose file name)
end tell

D.

Sorry, my mistake: the first attempt would save the converted text as .rtf.

try this:

set a to choose file
tell application "TextEdit" to open a

set theText to text of document 1 of application "TextEdit"
set text of document 1 of application "TextEdit" to (do shell script "echo " & quoted form of theText & " | tr \\n \\r | awk -F\\t '{printf\"%15d\\t%15s\\t%32s\\t%8s\\t%50s\\t%4s\\t%36s\\r\", $1, $2, $3, $4, $5, $6, $7}' ")

tell application "TextEdit" to save document 1 in (choose file name default name "outfile.txt")

Thanks Dominik :wink: for the fast reply

I just tried your script. To answer your question: yes, the text is tab delimited and each row seems to end with a return (which can easily be replaced if it causes any problems).

You’re using the shell to format the text, which worked only partially.
Result of your script (it starts with spaces, then the information follows):
" 1 Ilres 146 rue du Cimetiere 1338 Luxembourg"
I need it in this format:
"1 Ilres 146 rue du Cimetiere 1338Luxembourg
"

1st: I need to keep my data at the beginning of each field; the spaces only serve as filler to reach the required number of characters per field.
Is it possible to accomplish this using a shell script?
2nd: The script seems to work only on the first paragraph (it doesn’t proceed through the whole document).

Here’s part of the text document I’m working on:

{"1		Ilres	146	rue du Cimetiere	1338	Luxembourg
2		HINCK Olivier	13	Bousserstrooss	9144	Hierheck
3		Cox	15	Bousserstrooss	9144	Hierheck
4		KREIN-MATHIEU Marco	16	Martelerstrooss	9144	Hierheck
5		MATHONY-KAYSER Jos	18	Martelerstrooss	9144	Hierheck
6		WEIBEL-FEYEN Armand	19	Martelerstrooss	9144	Hierheck
7		MERTENS-REIFENBERG Martin	12	a Klatzber	9150	Eschdorf
8		MAYER-REINERT Charles	16	a Klatzber	9150	Eschdorf
9		DEPREZ-LUTUN Philippe	18	a Klatzber	9150	Eschdorf
10		Weintzen Soeurs	19	a Klatzber	9150	Eschdorf
"}

OK, the problem with the alignment of the final text is solved :wink:

set a to choose file
tell application "TextEdit" to open a

set theText to text of document 1 of application "TextEdit"
set newText to (do shell script "echo " & quoted form of theText & " | tr \\n \\r | awk -F\\t '{printf\"%-15d\\t%-15s\\t%-32s\\t%-8s\\t%-50s\\t%-4s\\t%-36s\\r\", $1, $2, $3, $4, $5, $6, $7}' ")

tell application "TextEdit"
   set newDoc to make new document
   set text of newDoc to newText
   save newDoc in (choose file name)
end tell

The field widths had to be negative in the script, as you can see above.
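
For anyone wondering what the minus sign changes, here is a small sketch in plain shell. The helper names `pad_left` and `pad_right` and the sample value are made up for the demo and are not part of the scripts in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# %15s pads on the left, so the value ends up right-aligned.
pad_left()  { awk -v s="$1" 'BEGIN { printf "[%15s]", s }'; }
# %-15s pads on the right, so the value stays at the start of the field.
pad_right() { awk -v s="$1" 'BEGIN { printf "[%-15s]", s }'; }

right_aligned=$(pad_left "Ilres")
left_aligned=$(pad_right "Ilres")

echo "$right_aligned"
echo "$left_aligned"
```

The brackets are only there to make the padding visible; both results are 15 characters wide between them.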

Only the problem with processing the whole document remains :wink:

hi despite,

What do you mean by ‘paragraph’?
I thought a paragraph in text is delimited by a line break. Does that mean the script works only on the first line for you?

D.

For now the script only works on the first line ;(
Each line is delimited by a line break; even if I replace those with tabs, it doesn’t proceed any further than the first 7 fields.

Maybe you have an idea how to solve this?

hmm - might be a problem with different line breaks - try this, maybe it helps?:

set a to choose file
tell application "TextEdit"
	open a
	set theText to ""
	tell document 1
		repeat with thisP in paragraphs
			set theText to theText & return & thisP
		end repeat
	end tell
end tell

set newText to (do shell script "echo " & quoted form of theText & " | tr \\n \\r | awk -F\\t '{printf\"%-15d\\t%-15s\\t%-32s\\t%-8s\\t%-50s\\t%-4s\\t%-36s\\r\", $1, $2, $3, $4, $5, $6, $7}' ")

tell application "TextEdit"
	set newDoc to make new document
	set text of newDoc to newText
	save newDoc in (choose file name)
end tell

Thanks again for the hint about the different line breaks. I had to convert them to UNIX line endings, which went really fast thanks to the built-in function of SubEthaEdit.
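
The same conversion can probably be done on the command line as well. This is only a sketch, assuming the source file uses classic Mac (CR-only) line endings; the sample file and variable names are made up.

```shell
#!/bin/sh
# Hypothetical sample file with classic Mac (CR-only) line endings.
workdir=$(mktemp -d)
printf 'row one\rrow two\r' > "$workdir/cr.txt"

# awk splits records on LF, so the CR-only file looks like one long record.
records_cr=$(awk 'END { print NR }' "$workdir/cr.txt")

# Convert CR to LF. The quotes matter: they keep the backslashes
# away from the shell so that tr sees the escapes \r and \n.
tr '\r' '\n' < "$workdir/cr.txt" > "$workdir/lf.txt"
records_lf=$(awk 'END { print NR }' "$workdir/lf.txt")

echo "records before: $records_cr, after: $records_lf"
```

If awk only ever sees one record, it prints the first seven fields and stops, which would match the "only the first line" symptom described above.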

Btw, I had to remove the tabs from the script too, by dropping the tab escapes between the fields in the printf format (“\t%-32s” → “%-32s” etc.).

set a to choose file
tell application "TextEdit" to open a

set theText to text of document 1 of application "TextEdit"
set newText to (do shell script "echo " & quoted form of theText & " | tr \\n \\r | awk -F\\t '{printf\"%-15d\\%-15s\\%-32s\\%-8s\\%-50s\\%-4s\\%-36s\\r\", $1, $2, $3, $4, $5, $6, $7}' ")

tell application "TextEdit"
	set newDoc to make new document
	set text of newDoc to newText
	save newDoc in (choose file name)
end tell

After solving that problem I ran into another one: the whole database was too big (approx. 700 kB) to be processed at once (until then I had only tried the AppleScript on a few lines). After splitting it into several parts, the AppleScript finally did its job quite fast.

That leaves me with a nearly perfect file, apart from one little problem: “awk” seems to change some letters?!? I think it’s due to a different file encoding (Unicode / OS X…).

The script changes “an der” to “ar der”.

– OK, at this point I feel only a step away from the perfect solution –

Is there a way to process bigger files (such as 1.2 MB) at once?
I heard there’s a way to convert files to a different file encoding. I could use some help with that one too; maybe someone has already run into the same problem.

I guess 700 kB or more of text might be too large to hold in AppleScript variables :wink:
In that case it’s probably better not to do it with AppleScript at all.
I suggest using a ‘pure’ shell script (sorry for going off topic :wink: ).

cat infile.txt | awk -F'\t' '{printf "%-15d%-15s%-32s%-8s%-50s%-4s%-36s\r", $1, $2, $3, $4, $5, $6, $7}' > outfile.txt
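
To check that a row really comes out at 160 characters, something like the following sketch might help. It is an assumption-laden variant, not the command above: the `format_row` name is made up, it ends each row with `\n` instead of `\r` so the length is easy to measure, and it adds a precision (`.15`, `.32`, ...) so that over-long fields are cut off rather than pushing the later columns out of place.

```shell
#!/bin/sh
# Sample row from the thread: rank, (empty) first name, name,
# house number, street, ZIP, city, separated by tabs.
sample=$(printf '1\t\tIlres\t146\true du Cimetiere\t1338\tLuxembourg\n')

# Hypothetical helper: pad each field on the right to its width and
# truncate anything longer (15+15+32+8+50+4+36 = 160 characters).
format_row() {
  awk -F'\t' '{ printf "%-15.15s%-15.15s%-32.32s%-8.8s%-50.50s%-4.4s%-36.36s\n", $1, $2, $3, $4, $5, $6, $7 }'
}

row=$(printf '%s\n' "$sample" | format_row)
echo "row length: ${#row}"
```

Whether truncating long names is acceptable depends on the recipient of the file, so treat the precision part as optional.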

You have to make sure that your terminal’s character encoding settings match the encoding of your Unicode text; then there should be no problems with wrong characters.

The encoding can be changed with TextEdit or BBEdit, for example.
TextEdit: (convert to ‘plain text’ if it’s in RTF format) then you can choose the character encoding in the save dialog.
It works similarly in BBEdit.
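
A command-line alternative might be `iconv`. This is a sketch only: the encoding names below are assumptions, since the thread does not say which encodings the real file uses, and the sample text is made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# "Bibliothèque" written as UTF-8 bytes (è is the two bytes \303\250).
printf 'Biblioth\303\250que\n' > "$workdir/utf8.txt"

# Re-encode to ISO-8859-1 (Latin-1), where è is a single byte.
iconv -f UTF-8 -t ISO-8859-1 "$workdir/utf8.txt" > "$workdir/latin1.txt"

utf8_bytes=$(wc -c < "$workdir/utf8.txt" | tr -d ' ')
latin1_bytes=$(wc -c < "$workdir/latin1.txt" | tr -d ' ')
echo "UTF-8: $utf8_bytes bytes, Latin-1: $latin1_bytes bytes"
```

A single-byte encoding also keeps the byte count and the character count of each field the same, which can matter for fixed-width output.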

You can easily find out the settings of your textfile in Terminal:
file path/to/my/mytextfile.txt
→ mytextfile.txt: UTF-8 Unicode text
(for example …)

D.
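
A side note on the “an der” → “ar der” change, offered as a likely explanation rather than something confirmed in this thread: in the AppleScript versions, `tr \\n \\r` reaches the shell as `tr \n \r`. Because the backslashes are not quoted there, the shell strips them and tr receives the plain letters n and r, so every “n” becomes an “r”. The pure shell version has no tr step at all, which would explain why the wrong characters disappeared. A minimal sketch:

```shell
#!/bin/sh
# Unquoted: the shell removes the backslashes, so this is really "tr n r".
unquoted=$(echo "an der" | tr \n \r)

# Quoted: tr receives the escapes \r and \n and leaves the letters alone.
quoted=$(echo "an der" | tr '\r' '\n')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Inside an AppleScript string the quoted form would be written as `tr '\\r' '\\n'`.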

Well, even if it was a bit off topic :wink: it worked out perfectly for me :wink: (even those little character encoding errors stopped occurring).

Many thanks to you, Dominik. Big hug!