Help remove <return> characters from TextEdit text file.

I have a text file created in TextEdit. There are a bunch of extraneous <return> characters between the paragraphs in the file. I need a script that will get rid of the extra <return>s, leaving just one between the paragraphs.

Here’s the script I tried. I tried it with both “set ChR to ASCII character 13” and “set ChR to ASCII character 10”.
Basically I’m trying to replace 3 <return>s with 2, looping until it no longer finds 3 in a row.


set deskFolder to path to desktop folder as string
set fileName to "help_textedit.txt"
set theFileToRead to deskFolder & fileName
set file_to_read to read file theFileToRead


set ChR to ASCII character 13
set search_string to ChR & ChR & ChR
set replacement_string to ChR & ChR

repeat
	if file_to_read contains search_string then
		set AppleScript's text item delimiters to search_string
		set the text_item_list to text items of file_to_read
		set AppleScript's text item delimiters to replacement_string
		set the file_to_read to the text_item_list as string
		set AppleScript's text item delimiters to ""
	else
		exit repeat
	end if
end repeat

get file_to_read

Hi,

The problem, I think, is that there are two ASCII characters that can represent a return. The returns that TextEdit inserts when you press Enter are ASCII character 10 (a linefeed), not ASCII character 13 (a carriage return). So I think this should work:


set deskFolder to path to desktop folder as string
set fileName to "testingIt.txt"
set theFileToRead to deskFolder & fileName
set file_to_read to read file theFileToRead

set ChR to ASCII character 10
set search_string to ChR & ChR & ChR
set replacement_string to ChR & ChR

repeat
	if file_to_read contains search_string then
		set AppleScript's text item delimiters to search_string
		set the text_item_list to text items of file_to_read
		set AppleScript's text item delimiters to replacement_string
		set the file_to_read to the text_item_list as string
		set AppleScript's text item delimiters to ""
	else
		exit repeat
	end if
end repeat

get file_to_read

Greetz

Thanks for your reply, caenel. I think I’m on to the problem. I’ve just been searching the internet about text files and CR vs LF, and I found out about the “Windows” way of using both, i.e. CRLF. A quick test on my file suggests that this is indeed the type of line ending in this file. So I tried combining the two like:

set ChR to ASCII character 13
set LhF to ASCII character 10
set CRLF to ChR & LhF
set search_string to CRLF & CRLF & CRLF
set replacement_string to CRLF & CRLF

I think it will work. I’m rewriting my script, which is huge, to see if this fixes the problem. After pulling my hair out all day on this, I think I’m finally onto the right solution.

I’m basically writing a script which needs to modify text files from all over the place, so I guess I have to make my script limber enough to handle any of the 3 line endings… CR, LF, or CRLF.
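
A quick test like the one mentioned above could be sketched as follows. This is only an illustration: the handler name countOccurrences and the sample text are made up for the example, and it assumes the file's text has already been read into a variable.

```applescript
-- Hypothetical handler: count how many times `needle` occurs in `haystack`
-- by splitting on it with text item delimiters.
on countOccurrences(needle, haystack)
	set savedDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to needle
	set n to (count of text items of haystack) - 1
	set AppleScript's text item delimiters to savedDelims
	return n
end countOccurrences

set CR to ASCII character 13
set LF to ASCII character 10

-- Sample text with two CRLF line endings (stand-in for the file contents)
set sample to "a" & CR & LF & "b" & CR & LF & "c"

set crCount to countOccurrences(CR, sample)
set lfCount to countOccurrences(LF, sample)
set crlfCount to countOccurrences(CR & LF, sample)

{crCount, lfCount, crlfCount} --> {2, 2, 2}
```

If all three counts match, every CR is paired with an LF and the text is CRLF throughout. A nonzero CR count with no LFs points to CR-only line endings, and the reverse points to LF-only.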

Hello

You may use this piece of code

set LF to ASCII character 10

if file_to_read contains return & LF then
	set eol to return & LF
else if file_to_read contains LF then
	set eol to LF
else
	set eol to return
end if

set replacement_string to eol & eol
set search_string to replacement_string & eol

so your script will be able to deal with the three described line endings.
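
Putting the detection above together with the loop from the original script might look something like this. It is a minimal sketch, not a tested drop-in: the handler name collapseBlankLines is invented for the example, and it assumes the text uses one line-ending style consistently (a file with mixed endings would need normalizing first).

```applescript
-- Hypothetical handler: collapse runs of 3 or more line endings down to 2.
on collapseBlankLines(theText)
	set LF to ASCII character 10
	
	-- Detect the line ending in use; CRLF has to be tested before LF alone.
	if theText contains (return & LF) then
		set eol to return & LF
	else if theText contains LF then
		set eol to LF
	else
		set eol to return
	end if
	
	set replacement_string to eol & eol
	set search_string to replacement_string & eol
	
	-- Save the delimiters once and restore them after the loop.
	set savedDelims to AppleScript's text item delimiters
	repeat while theText contains search_string
		set AppleScript's text item delimiters to search_string
		set theItems to text items of theText
		set AppleScript's text item delimiters to replacement_string
		set theText to theItems as string
	end repeat
	set AppleScript's text item delimiters to savedDelims
	
	return theText
end collapseBlankLines

-- Four returns in a row are reduced to two.
collapseBlankLines("one" & return & return & return & return & "two")
```

Restoring the saved delimiters instead of setting them to "" keeps the handler from disturbing whatever the rest of a larger script has set.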

Yvan KOENIG (from FRANCE, Tuesday 5 December 2006 19:31:36)

I like that Yvan. I will use it, thanks.

This should also work:

read ((path to desktop as Unicode text) & "testingIt.txt")
set fileText to paragraphs of result

set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to {ASCII character 10}
set fileText to "" & fileText
set AppleScript's text item delimiters to ASTID

return fileText -- Do whatever you want with `fileText`

Alternate version of Yvan’s idea:

set file_to_read to "a" & (ASCII character 13) & (ASCII character 10) & "b"
set eol to ""

tell (return) to if file_to_read contains it then set eol to eol & it
tell (ASCII character 10) to if file_to_read contains it then set eol to eol & it

count eol --> 2

Bruce, are you saying that with your code I can read in any text (with any line endings), and then, once I start working with the result of your code (i.e. return fileText), I’ll only have to deal with {ASCII character 10} line endings in the rest of my code?

In other words your code changes all the line endings to {ASCII character 10} line endings?

Correct.
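
Building on the paragraphs approach, the extra returns could probably be dropped in the same pass by skipping the empty paragraphs before joining. This is a sketch under a couple of assumptions: the handler name normalizeParagraphs is hypothetical, and the behavior differs from the original loop in that every paragraph ends up separated by exactly one blank line, including paragraphs that were separated by only a single return.

```applescript
-- Hypothetical handler: normalize line endings to LF and leave exactly
-- one blank line between non-empty paragraphs.
on normalizeParagraphs(theText)
	set LF to ASCII character 10
	set kept to {}
	
	repeat with aPara in paragraphs of theText
		set p to contents of aPara
		-- Empty paragraphs are the extra returns; skip them.
		if p is not "" then set end of kept to p
	end repeat
	
	-- Join the surviving paragraphs with two linefeeds.
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to LF & LF
	set theResult to kept as string
	set AppleScript's text item delimiters to ASTID
	
	return theResult
end normalizeParagraphs

-- Mixed CR and LF input; the result uses only LF line endings.
normalizeParagraphs("one" & return & return & return & "two" & (ASCII character 10) & "three")
```

If single returns inside the text need to be preserved as they are, the search-and-replace loop is the safer choice.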