NSString's stringWithString Memory Usage

I have a function that writes text to a file using NSString’s writeToFile. My script often writes thousands of files during execution. Xcode’s debug memory gauge shows memory usage growing steadily to hundreds of megabytes over the course of execution.

This is the function and code I am testing with.


use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

on writeTextToFileNS(posixPath, fileContent)
	set theNSString to current application's NSString's stringWithString:fileContent
	set {theResult, theError} to theNSString's writeToFile:posixPath atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theResult is false then error (theError's localizedDescription() as text) number 100
end writeTextToFileNS

set fileContent to "abc"

-- Let's write 12 KB of content
repeat with i from 1 to 12
    set fileContent to fileContent & fileContent
end repeat

-- Create lots of files!
repeat with i from 1 to 10000
    writeTextToFileNS("/tmp/writeTextToFileNS/file_" & i, fileContent)
end repeat

After some process of elimination, I’ve determined that the culprit is NSString’s stringWithString. The more this function is called the more memory is used. Am I doing something improper or is there anything I can do to free up the memory allocated by stringWithString?

For comparison, if I switch to using a function that uses AppleScript’s “open for access with write permission” to write files, the memory usage grows to around 90 megabytes and then stays constant.

Thanks!
Lance

Do you have Automatic Reference Counting turned on?

I hadn’t looked, but I see now that Apple LLVM 8.1 - Language - Objective C : Objective-C Automatic Reference Counting is Yes for both Debug and Release targets.

Lance

There’s nothing you can do about memory use in ASObjC. You could try setting theNSString to missing value after the write, to see if that makes any difference.

But I’m not entirely convinced it’s a serious problem. One possible explanation is that AppleScript doesn’t release the memory until it does its own periodic garbage collection. That means a bigger memory footprint – which hardly matters these days – but it means it won’t keep happening indefinitely, which is what you obviously don’t want.

I’d try running it in release mode for a long time, and see what happens in the real world. Even then, the values in Activity Monitor don’t mean what a lot of people assume them to.
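For illustration, the "set theNSString to missing value" suggestion above could be sketched like this. It reuses the handler from the original post; the only change is the extra line at the end. Whether dropping the reference actually makes the memory come back sooner is an assumption to test, not something this sketch guarantees.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

on writeTextToFileNS(posixPath, fileContent)
	set theNSString to current application's NSString's stringWithString:fileContent
	set {theResult, theError} to theNSString's writeToFile:posixPath atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theResult is false then error (theError's localizedDescription() as text) number 100
	-- Drop the script's reference to the Cocoa object.
	-- This is only a hint; when (or if) the memory is released is up to AppleScript.
	set theNSString to missing value
end writeTextToFileNS
```

Comparing the memory gauge with and without the last line over the same 10000-file loop would show whether it makes any difference in practice.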

I have been in discussion with the AppleScript developers through the bug reporter multiple times about this, and they won’t recognize it as a problem, which is a huge disappointment. Their answer was that the system will balance this out over time, which makes no attempt to solve the problem. For tight loops with strings it will work “fine” and the memory footprint will only be a few hundred megabytes, which is still flawed but we can live with it. But when you use much larger data from binary files, even modern machines will run out of memory; even my machines with 32 gigabytes ran out of RAM.

When working in SE it gets even worse with AppleScriptObjC. Repeat the code several times and the 10 previously used components are still allocated. So even NSString objects can soon make modern machines run out of RAM. Apple responded very clearly and explained everything very well, but they weren’t planning to change this behavior. I think it’s important to mention here since the example code can be run in SE, so after the code completes, don’t be surprised that SE keeps using a lot of memory. It’s something I noticed when writing osaxen, as I used SE in Xcode’s debugger to debug the osaxen.

Thanks for the suggestions. Glad to know I’m not missing something obvious.

I haven’t observed that garbage collection ever lowers the memory usage. I stopped testing when it reached 1 GB used. I’m not writing a long-running daemon so high memory usage isn’t a show stopper.

I guess I can take some comfort in knowing Apple is at least aware this situation is less than ideal. It sure would be nice if setting to “missing value” was a strong hint to release the memory!

I had thought of implementing some other of my AS string manipulation functions with NSString equivalents, but I guess the more I call stringWithString the worse the issue will become. Too bad.

Lance

One thing you might be able to do, depending on your code, is to use mutable strings. Then you can set them to “” when you’ve finished with them, so they use only a very small amount of memory. So recasting your example above:

    on writeTextToFileNS(posixPath, fileContent)
        set theNSString to current application's NSMutableString's stringWithString:fileContent
        set {theResult, theError} to theNSString's writeToFile:posixPath atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
        if theResult is false then error (theError's localizedDescription() as text) number 100
        theNSString's deleteCharactersInRange:{0, theNSString's |length|()}
    end writeTextToFileNS

There may be other side-effects with performance or whatever, but it might be better than nothing.

It is, and the used resources will be flagged for release as soon as possible. The time between the flag and the actual release depends on your code, and the tight loop isn’t helping. Garbage collectors like the one in AS try not to interfere with your code, so they try to run at an appropriate time, when changing stack for instance. Unfortunately it doesn’t get triggered when changing stack, but it would be nice if it did, because then you wouldn’t have such a large memory footprint.

Shane wrote a perfect example of using a mutable string, because (at least for a CFString) when the string’s length is set to 0, a reallocation is applied in the background that frees the old resources. You still have the C struct for the CFString allocated thousands of times, and those need memory as well. So you could improve the code by creating a single instance of NSMutableString and sending a pointer to that string («class ocid») to the handler. Something like this:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

on writeTextToFileNS(posixPath, theNSString)
	set {theResult, theError} to theNSString's writeToFile:posixPath atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(reference)
	if theResult is false then error (theError's localizedDescription() as text) number 100
end writeTextToFileNS

set fileContent to "abc"

-- Let's write 12 KB of content
repeat with i from 1 to 12
	set fileContent to fileContent & fileContent
end repeat

-- Create lots of files!
set theString to current application's NSMutableString's alloc()'s init()
repeat with i from 1 to 10000
	(theString's setString:fileContent)
	writeTextToFileNS("/tmp/writeTextToFileNS/file_" & i, theString)
end repeat

It doesn’t answer your question but it shows a discipline that can be followed to avoid large memory footprints.

Thanks for all the good points made. Both deleteCharactersInRange and setString are elegant solutions and I can see using each in different scenarios.

I’ve avoided creating objects using alloc() because it’s my understanding that I have to release them myself, and I didn’t think that was possible. If I set to “missing value” will the garbage collector release alloc’d objects on my behalf (theoretically)?

Somewhat related, my “append file” function suffers from the same memory buildup. Since NSString can only create files, I’m using NSFileHandle and NSData something like the following to append. Any recommendations on how to coerce NSData to release its memory?

-- NSMutableString, so that deleteCharactersInRange: at the end is a valid call
set theNSString to current application's NSMutableString's stringWithString:fileContent
set theNSData to theNSString's dataUsingEncoding:(current application's NSUTF8StringEncoding)
set theOpenedFile to current application's NSFileHandle's fileHandleForWritingAtPath:posixPath
if theOpenedFile is missing value then error "File cannot be opened." number 100
-- Move to the end of the file so the write appends
theOpenedFile's seekToEndOfFile()
theOpenedFile's writeData:theNSData
theOpenedFile's closeFile()
theNSString's deleteCharactersInRange:{0, theNSString's |length|()}

No, that’s only the case if you are using manual reference counting, which you’re not. In any case, you could use the string method instead.
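As a rough sketch of what using the string method instead of alloc()'s init() might look like: the class method is written |string|() here, with pipes, because string is also an AppleScript term. The loop and the sample text are made up purely for illustration.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

-- Convenience constructor: returns an empty mutable string, no alloc()/init() needed
set theString to current application's NSMutableString's |string|()

-- Reuse the same object on every pass instead of creating a new one
repeat with i from 1 to 3
	(theString's setString:("sample content " & i)) -- illustrative text only
end repeat

return theString as text
```

The result can be passed to a handler in the same way as the alloc()'s init() version shown earlier in the thread.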

I don’t see anything obvious.

Not entirely, but it may feel like that. At least AppleScript’s GC gives the green light for (AppleScript)Objective-C to release an object, which will free itself when its retain count reaches zero. AppleScript is never aware of the Objective-C runtime in its own VM, and therefore it cannot directly release and retain objects.

I don’t know enough ObjC to pull this off yet, but if I were to write an appendFile function in ObjC that performed the whole dataUsingEncoding + NSData thing, would you expect the memory management to be better?

Yes, you could do that.

I think it would be easier to learn how to use mutable objects, write efficient code, and apply other disciplines in AppleScript(ObjC) than to learn an entirely new programming language, only to implement the same inefficient code again with an @autoreleasepool doing some damage control.

I agree. In this case, AppleScript file management is pretty fast, so an ObjC appendFile might be pointless. I was mostly asking just to validate my understanding of how ObjC can be used to enhance AS projects. Thanks for the help!

You’re more than welcome, and these are interesting topics. My point is not to discourage you from moving to ObjC, but in terms of efficiency the code can be improved further in AppleScriptObjC by using a single mutable object. ObjC code will run fast and its performance will mask some of the problems, and of course there is no AppleScript GC holding back ARC.

Allocating and releasing in a tight loop is a “problem” for ObjC as well. It is almost a general problem in OO programming when you create and release objects in very tight loops and don’t manage memory manually. You can reduce these problems a little with an @autoreleasepool, but you still have high memory bandwidth usage, and the problem is only reduced, not solved. So to write the most efficient code in ObjC you have to follow a certain discipline that you could have applied in AppleScriptObjC as well. So in the end, was rewriting the code in ObjC worth it? To answer your question: ObjC will enhance your code, yes, but in this specific situation the code itself can be enhanced as well, and that will give a much bigger improvement.

I still encourage you to learn ObjC, because the paradigms of AS and ObjC are not the same, and to write the best and most efficient AppleScriptObjC code you should know both. I do not encourage you to learn ObjC to “enhance AS projects”, because the level of enhancement will probably never be satisfying compared to the amount of work you had to put into learning ObjC; the result will only end in disappointment.

DJ,

Can you see how that can be applied in this case, where the OP is using NSString’s -dataUsingEncoding: method? I couldn’t see a way around creating a non-mutable object each time that’s called.