Highlight files using non-acceptable characters

Hi there,

Yeah, I’d read that part thanks. I was interested in using perl to achieve the illegal character replacement.



Using the ‘NOT’ character helps:

perl -pe 's/[^[:alnum:],^_,^ ,^-]/_/g'

Change to suit your needs.


What this does is change all characters in the string that are not alphanumeric (^[:alnum:]) and not underscores (^_) or spaces (^ ) or hyphens (^-) into underscores: the _ between the last two slashes.

If anyone is interested, here is a Ruby script that corrects the file name and then renames the file.
One advantage here is that if there are a lot of files, this will be much faster than the equivalent AppleScript.



Just a note to clear up a latent usage error in the Perl code that has been offered in this thread.

The outer square brackets introduce a character class in the regexp used by the s operator, but the example above has a few too many characters in it.

You only need one caret at the beginning of the character class to invert/complement the entire character class. Any subsequent carets are taken as members of the (pre-complement) class. Commas are also taken as literal members of the class; there is no separator character inside character classes. To include a literal dash (hyphen), it must be at the start or end of the class (as in the above example, where it comes last) or escaped.

So, a version that does not inadvertently allow comma and caret to slip through would look like this:

perl -pe 's/[^[:alnum:]_ -]/_/g'

Here is some AppleScript that demonstrates the Perl programs above plus one other variation that uses the tr operator instead of s. The tr operator does not use regexps (though it does allow for ranges in a manner similar to that of regexp’s character classes), so it might be easier to follow for people unfamiliar with regexps. The c option at the end complements the selection class (analogous to the leading caret in a regexp character class).

set convert_cmd to "perl -pe 's/[^[:alnum:],^_,^ ,^-]/_/g'"
set convert_cmd2 to "perl -pe 's/[^[:alnum:]_ -]/_/g'"
set convert_cmd3 to "perl -pe 'tr/A-Za-z0-9_ -/_/c'"

set data_str to "alnum123space underscore_dash-comma,caret^other!@#$%&*()+=[]~`{}\\|:;'\"\"<>./?"
set expected_str to "alnum123space underscore_dash-comma_caret_other_____________________________"
set data_cmd to "/bin/echo -n " & quoted form of data_str

set a to do shell script data_cmd & "|" & convert_cmd
set b to do shell script data_cmd & "|" & convert_cmd2
set c to do shell script data_cmd & "|" & convert_cmd3

{a is equal to expected_str, a, b is equal to expected_str, b, c is equal to expected_str, c} --> {false, "alnum123space underscore_dash-comma,caret^other_____________________________", true, "alnum123space underscore_dash-comma_caret_other_____________________________", true, "alnum123space underscore_dash-comma_caret_other_____________________________"}

Is the perl version

perl -pe 's/[^[:alnum:]_ -]/_/g'

much faster than the UNIX version

/usr/bin/tr -c '[:alnum:]- ' _

or why is the perl version preferred?

Thanks. I guess I should have done a better job of testing.

Out of habit. :wink:

If I try to use the AppleScript with Craig’s Ruby version I get an error.
Here’s my AppleScript:

set path_to_this_ruby_file to quoted form of "/users/nik/desktop/rename_files.rb"
set path_to_folder to quoted form of "/users/nik/desktop/testtesttest/"
do shell script "ruby " & path_to_this_ruby_file & space & path_to_folder

Is it my AppleScript, or could it be because I’m on 10.4.11?



What is the error?