Copy text file one line at a time...

It’s Sunday afternoon and I’m officially brain-dead. :frowning:

A while back, I wrote a MIF parser that reads in a whole MIF file and moves the markers around in large FrameMaker files before writing the new, resorted string back to disk. In AppleScript, it’s slow, but it works… until now.

The MIF files I was given to work with averaged around 100,000 lines, no problem. I just received a file that’s 990,000 lines (>20 MB) and it’s causing out-of-memory errors.

What’s the easiest way to read and write a file line-by-line (or TID by TID) using AppleScript? I tried a quick plain vanilla C program in Xcode that just reads and writes the file, with no parsing, and it took a matter of seconds (!), but it only works line-by-line if the file uses Unix-style line endings (i.e. each line ends in \n), which these don’t.

I’m a little too inexperienced with C to knock off an fgets/fputs routine that works with Apple/CR-delimited lines and parsing algorithms at the moment, and I’m a little too under-the-gun to learn enough Xcode and Obj-C right now, so I thought I’d try reworking my AppleScript program, but I’m fuzzing out under the pressure over what’s probably an incredibly simple process. The main problem is that I’m not 100% sure the AppleScript solution will be fast enough; as mentioned, the C version took seconds, but I don’t know if AS is up to snuff.

As always, any help is greatly appreciated.

– Walt Sterdan

Assuming you’ve opened the file for access and want to read it into a variable for parsing, you should see the Scripting Additions’ definition of “read”. You can read aFile from someStartByte for someNumOfBytes, or to placeToStop. You can also use delimiters, so as long as the file isn’t one continuous string with no line feeds, returns, or CR/LFs, you can grab a line at a time using the line-end delimiter.
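As a rough illustration of the byte-range form of “read”, here is a minimal sketch that copies a file in fixed-size chunks so that only one chunk is held in memory at a time. The paths, the chunkSize value, and the process handler are placeholders made up for the example, and it assumes CR-delimited, single-byte text and a process handler that returns the text to write. Treat it as a starting point under those assumptions, not as a finished routine.

```applescript
-- Hypothetical paths and chunk size: adjust to suit.
set inPath to "path:to:largefile"
set outPath to "path:to:resortedfile"
set chunkSize to 65536 -- bytes per read (arbitrary choice)

set inRef to open for access file inPath
set outRef to open for access file outPath with write permission
try
	set eof outRef to 0 -- empty the output file before writing
	set fileSize to (get eof inRef)
	set pos to 1
	set leftover to "" -- partial line carried over from the previous chunk
	repeat while pos <= fileSize
		set lastByte to pos + chunkSize - 1
		if lastByte > fileSize then set lastByte to fileSize
		set chunk to leftover & (read inRef from pos to lastByte)
		set pos to lastByte + 1
		set theLines to paragraphs of chunk
		-- The last item may be an incomplete line, so hold it back.
		set leftover to item -1 of theLines
		repeat with i from 1 to (count theLines) - 1
			write (process(item i of theLines) & return) to outRef
		end repeat
	end repeat
	-- Whatever remains is a final line with no trailing return.
	if leftover is not "" then write process(leftover) to outRef
on error errMsg number errNum
	close access inRef
	close access outRef
	error errMsg number errNum -- pass the error along after cleaning up
end try
close access inRef
close access outRef

to process(p)
	--> here your code for line-by-line; this placeholder returns the line unchanged
	return p
end process
```

Holding back the last paragraph of each chunk is what keeps a line that straddles two reads from being split. Note that “paragraphs of” also breaks on line feeds, so a file with CR/LF endings could pick up a stray empty line if a chunk boundary falls between the CR and the LF.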

Thanks Adam, that’s all I needed… just a moment of clarity (or sanity). I knew it was easy, but I was so overwhelmed with the other aspects of the task that I wasn’t focusing clearly. :wink:

Thanks again,

Walt Sterdan

This is a different option. It will take several seconds to start processing (that is, to read the whole file into memory), but it should end up being faster than calling “read” so many times:

local t --> so we don't save it later
set t to read alias "path:to:largefile"

repeat with i from 1 to 99999999
	try
		set p to paragraph i of t
		process(p)
	on error --> no more paragraphs
		exit repeat
	end try
end repeat

to process(p)
	--> here your code for line-by-line
end process

That’s true, but my original program already reads the whole file into memory before parsing it; the problem is that with the 20+ MB MIF file (almost 1,000,000 lines of text) AppleScript dies with an “Out of memory” error. :frowning:

Thanks anyway, though, it’s appreciated.

– Walt Sterdan

Hmmm… True. I tested with a 20MB file and it worked. But it didn’t with a 25MB file :confused:

Sorry for the noise!

No need to apologize, I really do appreciate the help; sometimes it’s an answer that’s not quite right that makes things fall in place. :wink:

– Walt Sterdan