How to insert lines at start/ending of TextWrangler document

I’ve created a script to process TMX (= XML) files in TextWrangler. I’ve used some ‘clumsy’ ways to add lines at the start and end of the resulting file; see the part of the script under the comment “-- Fill these 5 lines with the TMX header”.

I was wondering, what would be the ‘elegant’ way?

And also: what is the best way to represent a line feed? I currently type “\n” and this works. However, when the script is saved in the editor, each “\n” turns into a real line break, so the string gets an ‘ugly’ wrap (in the part around “-- Fill these 5 lines with the TMX header”).


-- Release: 2016-10-20
-- Purpose: Create a compacted version of a TMX file

tell application "TextWrangler"
	tell front text document
		-- Remove line breaks at segment start
		replace "<seg>\\n" using "<seg>" options {search mode:grep, starting at top:true}
		-- Remove line breaks at segment ending
		replace "\\n</seg>" using "</seg>" options {search mode:grep, starting at top:true}
		-- Conversion from TMX to tab-del
		-- Insert tab characters between source and target segments
		replace "<\\/seg.*?seg>" using "	" options {search mode:grep, starting at top:true}
		-- Remove segment ending markup
		replace "</seg></tuv></tu>" using "" options {starting at top:true}
		-- Remove closing body markup
		replace "</body>" using "" options {starting at top:true}
		-- Remove closing TMX markup
		replace "</tmx>" using "" options {starting at top:true}
		-- Remove any TMX header
		--replace "<\\?[\\w\\W]*<body>\\r" using "" options {search mode:grep, starting at top:true}
		-- Remove segment start markup
		replace "<tu.*?seg>" using "" options {search mode:grep, starting at top:true}
		-- Start cleaning up the tab-del
		-- Remove numbers
		replace "\\d+" using "0" options {search mode:grep, starting at top:true}
		-- Remove punctuation characters
		replace "[!?ž""'˜\"]" using "" options {search mode:grep, starting at top:true}
		-- Remove HTML entities
		replace "&.*?;" using "" options {search mode:grep, starting at top:true}
		-- Replace non-breaking space with normal spaces
		replace "\\x{A0}" using " " options {search mode:grep, starting at top:true}
		-- Reduce space sequences to single spaces
		replace "[ ]{2,}" using " " options {search mode:grep, starting at top:true}
		-- Remove spaces at segment start
		replace "\\r[ ]" using "\\r" options {search mode:grep, starting at top:true}
		replace "\\t[ ]" using "\\t" options {search mode:grep, starting at top:true}
		-- Remove spaces at segment ending
		replace "[ ]\\r" using "\\r" options {search mode:grep, starting at top:true}
		replace "[ ]\\t" using "\\t" options {search mode:grep, starting at top:true}
		-- Remove duplicate lines
		process duplicate lines duplicates options {match mode:leaving_one} output options {deleting duplicates:true}
		-- Delete lines where source=target
		process lines containing matching string "^(.*?)\\t\\1" matching with grep true ¬
			output options {deleting matched lines:true}
		-- Delete lines without any letter
		process lines containing matching string "^((?![A-z]).)*$" matching with grep true ¬
			output options {deleting matched lines:true}
		-- Delete lines without a TAB
		process lines containing matching string "^[^\\t]*?$" matching with grep true ¬
			output options {deleting matched lines:true}
		-- Add one line break at the start of the document
		replace "\\A(.)" using "\\n\\1" options {search mode:grep, starting at top:true}
		-- Add segment markup step 1 start and endings
		replace "\\n" using "<\\/seg><\\/tuv><\\/tu>\\n<tu><tuv xml:lang=\"de-DE\"><seg>" options {search mode:grep, starting at top:true}
		-- Add segment markup step 2 tab characters
		replace "\\t" using "<\\/seg><\\/tuv><tuv xml:lang=\"nl-NL\"><seg>" options {search mode:grep, starting at top:true}
		-- Insert 5 empty lines at the start of the file
		set first line to "one" & "
" & "two" & "
" & "three" & "
" & "four" & "
" & "five"
		-- Fill these 5 lines with the TMX header
		set first line to "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
		set second line to "<tmx version=\"1.4\">"
		set third line to "<header datatype=\"plaintext\" segtype=\"sentence\" adminlang=\"EN-US\" srclang=\"de-DE\"><note>size=4</note>"
		set fourth line to "</header>"
		set fifth line to "<body>"
		-- Replace the last line of the file with the TMX closing markups
		set last line to "</body>" & "
" & "</tmx>"
		
		save to ((path to desktop folder) as text) & "Compacted memory.tmx"
	end tell
end tell

For years we have been able to use the predefined variable linefeed when we want to insert a linefeed.

set myHeader to "This" & linefeed & "is" & linefeed & "five" & linefeed & "lines" & linefeed & "header"
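
Applied to the script in the question, this could look like the sketch below. It is only a sketch, not tested against TextWrangler here: the variable names (headerLines, footerLines, tmxHeader, tmxFooter, savedDelimiters) are made up for illustration, and it assumes that "set first line to" and "set last line to" accept a multi-line string, as they already appear to do in the original script. The lines are joined with linefeed through text item delimiters, so no "\n" needs to be typed and the editor has nothing to turn into real line breaks when the script is saved.

```applescript
-- Hypothetical names; the header and footer text is copied from the question's script
set headerLines to {"<?xml version=\"1.0\" encoding=\"utf-8\"?>", ¬
	"<tmx version=\"1.4\">", ¬
	"<header datatype=\"plaintext\" segtype=\"sentence\" adminlang=\"EN-US\" srclang=\"de-DE\"><note>size=4</note>", ¬
	"</header>", ¬
	"<body>"}
set footerLines to {"</body>", "</tmx>"}

-- Join each list into one string, using linefeed as the separator
set savedDelimiters to AppleScript's text item delimiters
set AppleScript's text item delimiters to linefeed
set tmxHeader to headerLines as text
set tmxFooter to footerLines as text
set AppleScript's text item delimiters to savedDelimiters

tell application "TextWrangler"
	tell front text document
		-- Assumption: same behaviour as in the original script, where the
		-- first and last lines are overwritten rather than kept
		set first line to tmxHeader
		set last line to tmxFooter
	end tell
end tell
```

This would replace the two steps "Insert 5 empty lines" and "Fill these 5 lines with the TMX header" with a single assignment, and the closing markup with another.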

Yvan KOENIG running Sierra 10.12.0 in French (VALLAURIS, France) Thursday 20 October 2016 15:42:38