convert text file from PC to Mac (CRLF to LF ?)

How would you go about doing this? I can use BBEdit; in fact, I would prefer that.
I tried to “normalize line endings”, but I just get an error on that.

thanks
mm

Try something like this:

-- Example
choose file without invisibles
set test to read result

set ASTID to AppleScript's text item delimiters
set AppleScript's text item delimiters to return -- ASCII character 13
set test to "" & paragraphs of test
set AppleScript's text item delimiters to ASTID

test

Bruce, thanks for that. Unfortunately, that is going to take way too long; I have an 18 MB text file to process, so I need something a little more robust :slight_smile: BBEdit does a great job at this kind of stuff. I am doing some string substitutions there already and they are light years faster than the alternative AppleScript solution, so I’m hoping to find a way to do this using BBEdit.

again thank you

mm

I disagree that an AppleScript approach will take much longer than having BBEdit do this. The idea here is to read the text into a script variable (so it’s held in memory) and then act on that.


script T
	property myText : missing value
end script
choose file without invisibles
set T's myText to read result
set CR to ASCII character 13
set CRLF to CR & (ASCII character 10)
set myText to findAndReplace(CRLF, CR, T's myText)

-- Nigel Garvey's find/replace handler
-- returns class of original.
on findAndReplace(toFind, replaceWith, theText)
	set ASTID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to toFind
	set textItems to theText's text items
	set AppleScript's text item delimiters to replaceWith
	tell textItems to set editedText to beginning & replaceWith & rest
	set AppleScript's text item delimiters to ASTID
	return editedText
end findAndReplace

I got an out of memory error trying to run that. I have 2 gigs of ram on my machine so I’m not sure how that works…

mm

OK, I figured out how to do it in BBEdit. As long as you have your default line breaks set to “Mac (CR)”, you can do this:


set filePath to choose file
tell application "BBEdit"
	open filePath with LF translation
	-- do stuff
end tell

At least I think this is working; I’m not sure how to tell?

mm

Turn on “Show Invisibles” in the Edit>Text Options sheet.
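If you would rather check from a script than by eye, a rough Ruby sketch like the one below counts each kind of line ending in a file. The helper name count_line_endings is made up for illustration, and it assumes the whole file fits in memory (fine for an 18 MB file).

```ruby
# Count each kind of line ending in a chunk of raw bytes.
# count_line_endings is a hypothetical helper, not part of BBEdit or AppleScript.
def count_line_endings(data)
  crlf = data.scan("\r\n").length   # DOS pairs
  cr = data.count("\r") - crlf      # bare CRs (classic Mac)
  lf = data.count("\n") - crlf      # bare LFs (Unix)
  { crlf: crlf, cr: cr, lf: lf }
end

# Usage: ruby count_endings.rb /path/to/file.txt
if __FILE__ == $0 && ARGV[0]
  p count_line_endings(File.binread(ARGV[0]))
end
```

After a successful conversion the crlf and lf counts should both be 0 and only cr should be non-zero.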

correction


tell application "BBEdit"
	tell text document 1
		set line breaks to Mac
	end tell
end tell

read/write on the fly with reference numbers is pretty quick.


set f to choose file
set lf to (ASCII character 10)
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
	repeat
		set t to read ref1 before lf
		write t to ref2
	end repeat
on error
	close access ref1
	close access ref2
end try

This avoids out of memory errors.
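For comparison, here is roughly the same streaming idea as a Ruby sketch: read up to each LF, write the piece back out with only the CR, and never hold more than one line in memory. The name crlf_to_cr_by_line is just for illustration, and unlike the AppleScript above it leaves a bare LF (one with no CR in front) alone.

```ruby
require "stringio"

# Copy src to dst one LF-terminated line at a time, turning CRLF into CR.
# crlf_to_cr_by_line is a hypothetical name; src and dst are any IO-like objects.
def crlf_to_cr_by_line(src, dst)
  src.each_line("\n") do |line|
    # Only the line's own terminator is touched; the rest is copied as is.
    dst.write(line.sub(/\r\n\z/, "\r"))
  end
end

# Usage: ruby by_line.rb input.txt output.txt
if __FILE__ == $0 && ARGV.length == 2
  File.open(ARGV[0], "rb") do |src|
    File.open(ARGV[1], "wb") { |dst| crlf_to_cr_by_line(src, dst) }
  end
end
```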

gl,

Kel,

That works great. Even on the 18 MB text file it took two seconds, if that…

I’m unclear about how it knows what it is repeating with, and I’m not sure how I would incorporate a find and replace at the same time. It seems AppleScript may do that faster if it is done this way.

mm

Hi,

Come to think of it, if I’m remembering right, reading in blocks was way faster, like in 10 MB blocks. Nigel Garvey posted some quick scripts that avoid out of memory errors. I have to do some review.

gl,

Something like this is way faster, I think. It reads in fixed-size blocks (1 KB here; raise block_size for bigger blocks) instead of paragraphs. You could still get stack overflow errors if the paragraphs are very small.


set f to choose file
set lf to (ASCII character 10)
set dos_ending to return & lf
set block_size to 1024 -- 1KB
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
	repeat
		set t to read ref1 for block_size
		if t ends with return then -- move the byte marker 1B
			read ref1 for 1
		end if
		set t to ReplaceText(t, dos_ending, return)
		write t to ref2
	end repeat
on error err_msg
	close access ref1
	close access ref2
end try
--
-- searches text t for string s and replaces with string r
on ReplaceText(t, s, r)
	set utid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {s}
	set temp_list to text items of t
	set AppleScript's text item delimiters to {r}
	set temp_text to temp_list as string
	set AppleScript's text item delimiters to utid
	return temp_text
end ReplaceText

In the other script I posted, it reads a dos paragraph before the linefeed and writes that to the new file. When the byte marker is at the end of file, it errors when read and closes the file. There is no replacing needed because it’s not reading the linefeeds. This script, on the other hand, does a search and replace of the blocks read. If it’s still too slow, you might be able to speed it up by using a reference to the list in the ReplaceText subroutine.
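The tricky part of reading in blocks is a CRLF pair that straddles a block boundary. A Ruby sketch of one way to handle it: hold back a trailing CR and prepend it to the next block, so the pair is seen whole. The name crlf_to_cr_in_blocks and the 1 MB default are assumptions for illustration only.

```ruby
# Convert CRLF to CR while reading src in fixed-size blocks.
# crlf_to_cr_in_blocks is a hypothetical name; src and dst are IO-like objects.
def crlf_to_cr_in_blocks(src, dst, block_size = 1024 * 1024)
  held_cr = false                      # true when the previous block ended in CR
  while (block = src.read(block_size)) # read returns nil at end of file
    block = "\r" + block if held_cr
    held_cr = block.end_with?("\r")
    block = block[0...-1] if held_cr   # hold the CR until we see what follows it
    dst.write(block.gsub("\r\n", "\r"))
  end
  dst.write("\r") if held_cr           # a CR that was the last byte of the file
end

# Usage: ruby in_blocks.rb input.txt output.txt
if __FILE__ == $0 && ARGV.length == 2
  File.open(ARGV[0], "rb") do |src|
    File.open(ARGV[1], "wb") { |dst| crlf_to_cr_in_blocks(src, dst) }
  end
end
```

Unlike skipping one byte after a trailing CR, this never drops a character when the byte after the CR turns out not to be an LF.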

gl,

Kel, this is good stuff, thank you!

I modified the script for what I am working on. It returns a time of 29 seconds, which is much better than previous attempts.


set f to choose file
set FRPrefs to choose file
set FRPrefs to read FRPrefs as text
set StartTime to current date
set LF to (ASCII character 10)
set dos_ending to return & LF
set block_size to 1024 -- 1KB
set new_file to ((path to desktop as string) & "new.txt") as file specification
set ref1 to open for access f
set ref2 to open for access new_file with write permission
try
	repeat
		set T to read ref1 for block_size
		if T ends with return then -- move the byte marker 1B
			read ref1 for 1
		end if
		set T to replacetext(T, dos_ending, return)
		repeat with i from 1 to count of paragraphs in FRPrefs
			set apara to paragraph i of FRPrefs
			set apara to my makeList(apara, "|")
			set fText to item 1 of apara
			set rText to item 2 of apara
			set T to replacetext(T, fText, rText)
		end repeat
		write T to ref2
	end repeat
on error err_msg
	close access ref1
	close access ref2
end try
set EndTime to current date
set SubRoutineTimeDuration to (EndTime - StartTime)
display dialog SubRoutineTimeDuration
--
-- searches text t for string s and replaces with string r
on replacetext(T, s, r)
	set utid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {s}
	set temp_list to text items of T
	set AppleScript's text item delimiters to {r}
	set temp_text to temp_list as string
	set AppleScript's text item delimiters to utid
	return temp_text
end replacetext

on makeList(astring, asep)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to asep
	set thelist to {}
	repeat with i from 1 to count of text items in astring
		copy text item i of astring to end of thelist
	end repeat
	set AppleScript's text item delimiters to tid
	return thelist
end makeList

Now here is the BBEdit version; it returns in 18 seconds.


-- This is a simple translation droplet using BBEdit's replace feature. 


on open fileList
	set StartTime to current date
	set FilesToProcess to {}
	
	-- extract the prefs file from the group  of files
	tell application "Finder"
		repeat with aFile in fileList
			if name of aFile = "FRPrefs.txt" then
				set FRPrefs to aFile
			else
				copy aFile to end of FilesToProcess
			end if
		end repeat
	end tell
	
	repeat with aFile in FilesToProcess
		set filePath to (aFile as Unicode text)
		
		-- use text item delimiters to parse path
		set AppleScript's text item delimiters to ":"
		-- get path to the folder file is in
		set folderPath to (text 1 thru text item -2 of filePath) & ":"
		-- get file's name
		set fileName to text item -1 of filePath
		set AppleScript's text item delimiters to "."
		-- remove extension from name
		try -- next line will error if there is no extension
			set fileNameStub to (text 1 thru text item -2 of fileName)
			set theExtension to "." & text item -1 of fileName
		on error -- there is no extension
			set fileNameStub to fileName -- otherwise fileNameStub is undefined below
			set theExtension to ""
		end try
		set AppleScript's text item delimiters to ""
		
		set translationFilePath to folderPath & fileNameStub & "_converted" & theExtension
		
		set prefsText to read FRPrefs as text -- keep FRPrefs as the file so later passes can read it again
		--open the file
		tell application "BBEdit"
			activate
			open alias filePath
			tell text document 1
				set line breaks to Mac
			end tell
			tell text window 1
				--replacing code
				repeat with i from 1 to count of paragraphs in prefsText
					set apara to paragraph i of prefsText
					set apara to my makeList(apara, "|")
					set fText to item 1 of apara
					set rText to item 2 of apara
					replace fText using rText searching in text 1 of text document 1 options {search mode:literal, starting at top:true, wrap around:false, backwards:false, case sensitive:false, match words:false, extend selection:false}
				end repeat
				
				--write translated text to a new file
				save to file translationFilePath
				close saving no
			end tell
		end tell
		
	end repeat
	set EndTime to current date
	set SubRoutineTimeDuration to (EndTime - StartTime)
	display dialog SubRoutineTimeDuration
end open

on makeList(astring, asep)
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to asep
	set thelist to {}
	repeat with i from 1 to count of text items in astring
		copy text item i of astring to end of thelist
	end repeat
	set AppleScript's text item delimiters to tid
	return thelist
end makeList

I am running both of these on the same 18 MB file.

Thanks for the help
mm

I don’t have any files that big… How fast does this run for you?

choose file without invisibles
set sourceFile to POSIX path of result

choose file name -- Oops; Should be `file name` instead of `file`
set outputFile to POSIX path of result

set timer to current date

do shell script "/usr/bin/ruby -e 'print $stdin.read.gsub(\"\\r\\n\", \"\\r\")' < " & quoted form of sourceFile & " > " & quoted form of outputFile

set timer to (current date) - timer

Bruce,

what am I suppose to select as the output file ?

mm

You’re supposed to destroy an existing file, of course! :rolleyes: Actually, that was just a mistake; I edited the script above.

OK, so the results are in… just doing the CRLF to CR conversion:

Ruby scores a time of 0
BBEdit scores a time of 3
while AppleScript comes in last with 4 :frowning:

Now, there are more find and replaces that I am doing. Notice that in both the AppleScript and BBEdit scripts I posted earlier, I am supplying the input file(s) as well as a prefs.txt file. This is a list of pipe-separated find and replace pairs, so that the user can do all sorts of find and replace.

With Ruby, I am worried about how to escape special characters. I haven’t really gotten to know Ruby and I am reluctant to use something I know nothing about. Is it also a standard install? I know I had the Ruby extension installed on my computer so I could make Ruby AppleScripts, but I think the basic install was already there.

mm

I believe ruby is part of a standard install as of Mac OS X 10.3.

Hey Bruce,

What about just using sed? Or even tr:

tr '\r' '\n' < Input.txt > output.txt

From the terminal it is as fast as my computer can write to disk. There may be a slight performance hit in wrapping it in a do shell script.

Andy

Browser: Safari 412
Operating System: Mac OS X (10.4)

Edit:

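Actually, the OP seems to want CRLF to CR. sed works a line at a time and never sees the \n, so tr is the simpler tool; deleting the LF from each pair leaves just the CR (this assumes every LF in the file belongs to a CRLF pair):
tr -d '\n' < Input.txt > output.txt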

A slightly different script where you can manually add more replacements:

choose file without invisibles
set sourceFile to POSIX path of result

choose file name
set outputFile to POSIX path of result

--set timer to current date

do shell script "/usr/bin/ruby -e '
text = $stdin.read
text.gsub!(\"\\r\\n\", \"\\r\")
text.gsub!(\"find this\", \"replace with\")
text.gsub!(\"find that\", \"replace with\")
print text
' < " & quoted form of sourceFile & " > " & quoted form of outputFile

--set timer to (current date) - timer

I believe the only characters to watch out for would be quotes (you’d have to protect double quotes from ruby and single quotes from the shell).

If you can give me a small example of your prefs.txt file, I could try making a script that reads it.
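In the meantime, here is a rough sketch of what that might look like, assuming the prefs file holds one find|replace pair per line as in the scripts posted earlier. The helper names parse_pairs and apply_pairs are made up. Keeping the Ruby in its own file instead of inside do shell script also sidesteps the quoting worries, since the find and replace strings are never seen by the shell.

```ruby
# Parse "find|replace" lines (format assumed from the earlier scripts).
# parse_pairs and apply_pairs are hypothetical helper names.
def parse_pairs(prefs_text)
  pairs = prefs_text.split(/\r\n|\r|\n/).map do |line|
    find, replace = line.split("|", 2)
    [find.to_s, replace.to_s]
  end
  pairs.reject { |find, _| find.empty? }  # skip blank lines and empty searches
end

def apply_pairs(text, pairs)
  pairs.each do |find, replace|
    # A String pattern matches literally (no regex characters to escape),
    # and the block form inserts the replacement literally as well.
    text = text.gsub(find) { replace }
  end
  text
end

# Usage: ruby replace.rb prefs.txt < input.txt > output.txt
if __FILE__ == $0 && ARGV[0]
  pairs = parse_pairs(File.binread(ARGV[0]))
  $stdout.binmode
  $stdout.write(apply_pairs($stdin.binmode.read.gsub("\r\n", "\r"), pairs))
end
```

From AppleScript this could be called with do shell script, passing the script, prefs, source and output paths through quoted form of.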

tugboat, that should also work. :slight_smile: