Deleting blank lines in an xml document

I am looking for a script to find and delete all blank lines from an XML document in TextWrangler. For example:

Starting with;

<?xml version="1.0"?>
<format-control-parameters
	
	substitution-table="17029"

to return:

<?xml version="1.0"?>
<format-control-parameters
	substitution-table="17029"

Can anyone point me in the right direction please?

N.B. the line spacing may sometimes be irregular in the document.

David

There’s nothing tricky about this. An XML file is just a text file, so don’t worry that it’s XML. Read it into AppleScript like any other text file, and then you can manipulate the text any way you like.

set theText to read (choose file without invisibles)

Now that you have the text use text item delimiters…

set AppleScript's text item delimiters to return & return
set theTextList to text items of theText
set AppleScript's text item delimiters to return
set newText to theTextList as text
set AppleScript's text item delimiters to ""

See how text item delimiters were used to replace each double return with a single return? Now you just write newText back to a file with an .xml extension. Note that some files use the Unix newline (linefeed) character instead of the return character; in that case, replace each return in the script with ASCII character 10.
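The delimiter swap above can be wrapped in a small reusable handler. This is only a sketch: replaceText is a made-up helper name, not something built into AppleScript, and it restores whatever delimiters were set before it ran.

```applescript
-- replaceText is a hypothetical helper name, not part of AppleScript itself.
-- It swaps every occurrence of findText in theText for replaceWith.
on replaceText(theText, findText, replaceWith)
	set savedDelimiters to AppleScript's text item delimiters
	-- split the text wherever findText occurs
	set AppleScript's text item delimiters to findText
	set theItems to text items of theText
	-- join the pieces back together with replaceWith between them
	set AppleScript's text item delimiters to replaceWith
	set newText to theItems as text
	set AppleScript's text item delimiters to savedDelimiters
	return newText
end replaceText

set sample to "one" & return & return & "two"
set fixed to replaceText(sample, return & return, return)
-- fixed is now "one" & return & "two"
```

Keep in mind that one pass only replaces each pair of returns once. Three returns in a row come out as two, so the replacement has to be repeated until no double return is left.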

Thank you very much for the reply, Hank. I forgot to mention that I am a complete novice with AppleScript, and I could not get your suggestion to work (I possibly needed to write some additional code for the script to work in its entirety).

I was pointed in a different direction in another forum, and was able to come up with two AppleScripts for use with TextWrangler:

Delete Blank Lines;

tell application "TextWrangler"
	activate
	replace "\\r\\r" using "\\r" searching in text 1 of text document 1 options {search mode:literal, starting at top:false, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
end tell

Delete Blank Lines_recursive (for every document in the TextWrangler Document Drawer);

tell application "TextWrangler"
	activate
	replace "\\r\\r" using "\\r" searching in every text document options {search mode:literal, starting at top:false, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
end tell

Many thanks, and I see I will have to purchase a book on scripting for beginners. I am finding scripting quite fun and addictive (as well as being a productivity booster).

David
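A note on the two TextWrangler scripts above: the literal search replaces each pair of returns once, so a run of three or more returns is shortened but still leaves blank lines, and a "blank" line that contains a tab is not matched at all. A grep search can handle both in one pass. The sketch below is untested and assumes that TextWrangler's replace command accepts search mode:grep (as BBEdit's scripting dictionary does) and that \r matches a line break in its grep patterns.

```applescript
tell application "TextWrangler"
	activate
	-- grep pattern: a return followed by one or more lines that are
	-- empty or contain only spaces and tabs; the whole run becomes one return
	replace "\\r([ \\t]*\\r)+" using "\\r" searching in text 1 of text document 1 options {search mode:grep, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
end tell
```

Because the pattern starts with a return, blank lines at the very top of the document are left alone.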

I’m glad you found a solution with TextWrangler. I got addicted to AppleScript just like you. It starts out with one small project, and it’s so cool that you just have to learn more. Good luck with that. The people in this forum are very helpful, and this website has many tutorials you can learn from. Check out the unscripted section here: http://macscripter.net/viewforum.php?id=31. There’s a series of articles called “AppleScript for beginners” that I learned from.

You don’t really need any application to do this job, though. AppleScript can handle it by itself. I finished that script, so if you would rather use it, here it is. One advantage is that it keeps replacing until no double return characters are left, so if your file has 5 return characters in a row it reduces them to a single one. The TextWrangler scripts make a single pass that replaces each pair of returns with one, so longer runs are shortened but not fully removed. Anyway, give it a try and good luck.

-- this script will remove all double-returns from a text file
-- it checks for both the mac return character and the unix new line character

-- first you are asked to select the text file
-- next you are asked where to save the fixed text file

-- choose the text file
set theText to read (choose file without invisibles)
set origText to theText

-- remove the double return characters
repeat while theText contains (return & return)
	set AppleScript's text item delimiters to return & return
	set theTextList to text items of theText
	set AppleScript's text item delimiters to return
	set theText to theTextList as text
end repeat

-- remove the double unix new line characters
set LF to ASCII character 10
repeat while theText contains (LF & LF)
	set AppleScript's text item delimiters to LF & LF
	set theTextList to text items of theText
	set AppleScript's text item delimiters to LF
	set theText to theTextList as text
end repeat
set AppleScript's text item delimiters to ""

if theText is equal to origText then
	display dialog "Nothing needed to be fixed in the original file so no changes will be saved." buttons {"OK"} default button 1 with icon note
else
	-- write the fixed text to disk
	set newTextPath to choose file name with prompt "Where would you like to save the fixed file?" default name "untitled.xml"
	writeTo(theText, newTextPath, false, string)
end if


(*============= SUBROUTINES ============*)
on writeTo(this_data, target_file, append_data, mode) -- append_data is true or false, mode is string etc. (no quotes around either)
	try
		set target_file to target_file as Unicode text
		if target_file does not contain ":" then set target_file to POSIX file target_file as Unicode text
		set the open_target_file to open for access file target_file with write permission
		if append_data is false then set eof of the open_target_file to 0
		write this_data to the open_target_file starting at eof as mode
		close access the open_target_file
		return true
	on error
		try
			close access file target_file
		end try
		return false
	end try
end writeTo
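One limitation of the script above: it only collapses line endings that sit directly next to each other, so a line holding nothing but a tab (like the one in the original example) is kept. The handler below is a sketch of one way around that, assuming "blank" means empty or made up only of spaces and tabs. The names removeBlankLines and isBlank are made up for illustration. It relies on AppleScript's paragraphs, which split on return, linefeed, and return plus linefeed alike.

```applescript
-- removeBlankLines and isBlank are hypothetical helper names.
-- Returns theText without empty or whitespace-only lines,
-- joining the remaining lines with lineEnding.
on removeBlankLines(theText, lineEnding)
	set keptLines to {}
	repeat with aLine in paragraphs of theText
		set aLine to contents of aLine
		if not isBlank(aLine) then set end of keptLines to aLine
	end repeat
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to lineEnding
	set newText to keptLines as text
	set AppleScript's text item delimiters to savedDelimiters
	return newText
end removeBlankLines

-- True when aLine is empty or holds only spaces and tabs.
on isBlank(aLine)
	repeat with aChar in characters of aLine
		if contents of aChar is not in {space, tab} then return false
	end repeat
	return true
end isBlank

set sample to "<format-control-parameters" & return & tab & return & return & tab & "substitution-table=\"17029\""
set cleaned to removeBlankLines(sample, return)
-- cleaned keeps only the two lines that contain text
```

The result has no trailing line ending, since a final empty paragraph counts as blank. The cleaned text can be passed to writeTo in the same way as theText.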

Many, many thanks, Hank, your script worked perfectly.

I will be sure to read up in the unscripted forum also :)

Thank you again,

David