Depending on the application that saved the text file, files whose paragraphs are return-delimited or linefeed-delimited can have either the type identifier "com.apple.traditional-mac-plain-text" or the type identifier "public.plain-text", so the only way I can find to tell them apart is to test a character to see whether it's ASCII 10 or ASCII 13. That's OK, but my question is: is there a simple way to determine whether text is Unicode or plain ASCII?
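The character test described above can be sketched like this in Python (using the byte values mentioned: ASCII 13 for return, ASCII 10 for linefeed); this is just an illustration, not tied to any particular file:

```python
# Classify a text file's paragraph delimiter by scanning its raw bytes
# for CR (ASCII 13, classic Mac style) vs LF (ASCII 10, Unix style).

def detect_line_ending(data: bytes) -> str:
    """Return 'CR', 'LF', 'CRLF', or 'unknown' for the given raw bytes."""
    if b"\r\n" in data:
        return "CRLF"  # Windows-style pairs take precedence
    if b"\r" in data:
        return "CR"    # return-delimited (ASCII 13)
    if b"\n" in data:
        return "LF"    # linefeed-delimited (ASCII 10)
    return "unknown"

print(detect_line_ending(b"first paragraph\rsecond paragraph\r"))  # CR
```

Note this says nothing about the type identifier the saving application assigned; it only inspects the bytes themselves.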
Not really; the only way to be completely sure is to know in advance. Given a file that may or may not be Unicode, the usual thing is to see whether it starts with a byte-order mark (BOM); if it does, it's a fair bet (though not 100% guaranteed) that it's Unicode. If it doesn't, it may use some other encoding, or it may be that the BOM was simply omitted. More intelligent encoding sniffers will analyse the data and provide a best guess as to what encoding it is; here's one for Python I ran across the other day:
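The BOM check is easy to do yourself; here's a minimal sketch in Python using the standard library's BOM constants (remember that no BOM proves nothing, and even a match is only a strong hint):

```python
import codecs

def sniff_bom(data: bytes):
    """Return the encoding name implied by a leading BOM, or None."""
    # UTF-32-LE must be tested before UTF-16-LE: its BOM (FF FE 00 00)
    # begins with the UTF-16-LE BOM (FF FE).
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    return None  # no BOM: could be ASCII, MacRoman, BOM-less UTF-8, ...
```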
You could easily use it via 'do shell script'; it wouldn't be hard to knock up a Python script that reads the raw file data, sniffs its encoding and then, if sufficiently confident, converts it from that encoding to UTF-8 and returns the result. Or you could wrap it up in a scriptable faceless background application if you find that more convenient. (I plan on rolling it into TextCommands at some point, but you could always do it yourself.)
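A bare-bones sketch of such a script: it reads raw bytes from stdin and writes UTF-8 to stdout, so it could be driven from 'do shell script'. A real sniffer does statistical analysis; this version only tries a BOM check plus a couple of hand-picked fallback decodings, so treat it purely as an illustration of the shape of the thing:

```python
#!/usr/bin/env python3
# Read raw text data on stdin, guess its encoding, emit UTF-8 on stdout.
import sys
import codecs

def to_utf8(data: bytes) -> bytes:
    # 1. A BOM, if present, settles it.
    for bom, enc in [(codecs.BOM_UTF8, "utf-8-sig"),
                     (codecs.BOM_UTF16_LE, "utf-16"),
                     (codecs.BOM_UTF16_BE, "utf-16")]:
        if data.startswith(bom):
            return data.decode(enc).encode("utf-8")
    # 2. Otherwise try plausible encodings in order of strictness.
    #    MacRoman maps all 256 byte values, so it never fails.
    for enc in ("utf-8", "mac-roman"):
        try:
            return data.decode(enc).encode("utf-8")
        except UnicodeDecodeError:
            continue
    return data  # unreachable with the fallbacks above; kept for safety

if __name__ == "__main__":
    sys.stdout.buffer.write(to_utf8(sys.stdin.buffer.read()))
```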
It’s not an option, it’s part of the normal behavior. Without any other options specified, the output (for a typical ASCII text file) would be something like this:
When using this tool inside AppleScript, you might find the options '-b' and '-i' helpful. (Example follows.)