how to deal with file/folder names containing special characters

Hi all,

I need help with some code that finds file names containing special characters like * / \ : ? " |

The reason is that I am working on a project that transfers files and folders from Mac to Windows, and names containing the above characters can't be moved to Windows.

The relevant part of my code looks like this:

set illegal_syntax to paragraphs of (do shell script "find " & quoted form of POSIX path of oneFolder & " -name '*\**'")

if (illegal_syntax is not equal to {""}) then

repeat with each_record in illegal_syntax

do shell script "/usr/bin/ditto -c -k -rsrc --keepParent " & quoted form of each_record & space & quoted form of (each_record & ".zip")

end repeat

end if

First of all, that '*\**' gives me an error.

Second, what I tried to do is zip the file/folder at its current location once it is found. But how can I zip it without the special character in the name? Or should I just give it a fixed name, like "originalfile.zip"?

Last, maybe you have a better idea for dealing with this situation? BTW, I cannot rename or delete the files/folders because the customer won't allow it.

That is because AppleScript (like the sh shell) gives the backslash character in string literals a special meaning. The most common purpose is to let a string literal contain an internal double quote character. To do this you put a backslash in front of the embedded double quote character to prevent that double quote character from ending the string literal. But because of this special meaning of the backslash, if you want a backslash in a string literal you also have to put a backslash before it. So to get *\** as the value of a string written as a string literal, you must write "*\\**".

Next, if you want to apply identical processing to all the files that have these undesirable characters, then you can test for them all at once with a slight change to the -name conditional in your find command. Try -name '*[*/\\:?"|]*' (the double backslash is for the filename pattern interpretation that find will do), which when you write it as an AppleScript string literal looks like this: "-name '*[*/\\\\:?\"|]*'" (each backslash is doubled, and the double quote gets a backslash too; these escapes are undone when AppleScript converts the string literal to an internal string value).
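Before wiring a pattern like this into do shell script, it can help to try it directly in Terminal. The following is only a sketch: the scratch directory and the sample file names are made up for illustration. The slash is left out of the bracket expression here because -name only looks at the last path component, which cannot contain one.

```shell
# Scratch directory with a few sample names (all hypothetical).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/what?.txt" "$demo_dir/a*b.txt"
touch "$demo_dir/pipe|name" "$demo_dir/colon:name" "$demo_dir/back\\slash"

# One bracket expression covers every unwanted character.
# Inside single quotes the shell passes the pattern through untouched;
# find itself reads \\ as one literal backslash.
matches=$(find "$demo_dir" -name '*[*\\:?"|]*' | sort)
printf '%s\n' "$matches"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Everything except plain.txt should be listed. When the same pattern is written inside an AppleScript string literal, each backslash and the double quote need one more layer of escaping.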

If you would like to delete the characters or replace them with a specific character or string you could use some handlers like these (demonstration in the run handler):

on run
	set str to "a fi|e name: with l*ts of \"b/\\d\" Chars?"
	
	set replaced to replaceBadChars(str, "_")
	set deleted to deleteBadChars(str)
	{str, replaced, deleted}
	--> {"a fi|e name: with l*ts of \"b/\\d\" Chars?", "a fi_e name_ with l_ts of _b__d_ Chars_", "a fie name with lts of bd Chars"}
end run

(* switchText From: http://bbs.applescript.net/viewtopic.php?pid=41257#p41257
Credit: kai, Nigel Garvey*)
to switchText from t to r instead of s
	local d
	set d to text item delimiters
	try
		set text item delimiters to s
		set t to t's text items
		-- The text items will be of the same class (string/unicode text) as the original string.
		set text item delimiters to r
		-- Using the first text item (beginning) as the first part of the concatenation means we preserve the class of the original string in the edited string.
		tell t to set t to beginning & ({""} & rest)
		set text item delimiters to d
	on error m number n from o partial result r to t
		set text item delimiters to d
		error m number n from o partial result r to t
	end try
	t
end switchText
to replaceBadChars(str, replacementStr)
	local badChars
	set badChars to characters of "*/\\:?\"|"
	-- The plural name of "text item delimiters" would tend to indicate that you could use a list of many such delimiters. In practice (on my machine) only the first one is effective.
	-- In other words, when I tried "switchText from str to replacementStr instead of badChars" the only character that was replaced was the first: asterisk.
	repeat with ch in badChars
		set str to switchText from str to replacementStr instead of contents of ch
	end repeat
	str
end replaceBadChars
to deleteBadChars(str)
	replaceBadChars(str, "")
end deleteBadChars

You might use something like set zipName to deleteBadChars(each_record) then use zipName for the name of your zip file that contains each_record. Though if you are processing a folder that has subfolders and those subfolders have the undesirable characters in them, this will likely fail because you will be deleting/replacing the characters in the subfolder names too, likely resulting in pathnames that reference non-existent subfolders.
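One way around the subfolder problem is to clean only the last path component and leave the parent directories alone. The sketch below does that in the shell with dirname, basename and tr. The helper name safe_path and the choice of underscore as the replacement are assumptions made for illustration, not something from the scripts above.

```shell
# safe_path PATH: print PATH with unwanted characters in the last
# component replaced by underscores; parent directories are untouched.
safe_path() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    # tr maps * \ : ? " | one-to-one onto six underscores.
    clean=$(printf '%s' "$base" | tr '*\\:?"|' '______')
    printf '%s/%s\n' "$dir" "$clean"
}

safe_path '/jobs/what?/final*draft.zip'
# prints /jobs/what?/final_draft.zip
```

The parent folder keeps its question mark, so the resulting path still points into a folder that exists.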

Usually I try to avoid making ‘archive’ files that contain only a single file, so I would probably try to package up the files into a single “.zip” file. But you are in the best position to make a decision like that since we do not know the details of why you need to package up the files in the first place. Yes, you say that they are going to Windows, but how are they going to be used there? How is the Windows user going to extract them from the “.zip” file? Will their extractor even be able to handle extracting archived filenames that have illegal characters in them? If so, what does it do with the characters? Individually replace them? Replace them as a group? Drop them from the resulting filename?

Also, this particular find technique will fail if you process any files that have carriage returns or linefeeds in their names.
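If names with line breaks are a real concern, one option is to let find hand the matching paths to a small inline script through -exec, instead of printing them for AppleScript to split into paragraphs. A sketch, using a made-up scratch directory and a counter standing in for the real per-file work:

```shell
demo_dir=$(mktemp -d)
# A name containing both a question mark and a newline (hypothetical).
touch "$demo_dir/odd?
name"
touch "$demo_dir/plain.txt"

# find passes each matching path as a separate argument, so an embedded
# newline never splits a name. The inline script sees the paths as "$@".
count=$(find "$demo_dir" -name '*[*\\:?"|]*' -exec sh -c '
    n=0
    for path in "$@"; do
        n=$((n + 1))    # replace with the real per-file work
    done
    printf "%s\n" "$n"
' sh {} +)
printf 'matched %s item(s)\n' "$count"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

With a very large number of matches find may start the inline script more than once, so a real version should do its work inside the loop rather than rely on a single total.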

Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari 3.0.4 (523.12)
Operating System: Mac OS X (10.4)

Hi chris,

Thanks for your reply. Your code helps me a lot.

To answer the questions you asked at the bottom of your reply, I will simply put my whole code here.


set shareUser to "administrator"
set sharePass to "password"
set shareLoc to "10.11.22.115/share"
set mountDir to "share"
--
set volumesDir to ":users:administrator:desktop:" & mountDir
set cmdDir to "/users/administrator/desktop/" & mountDir
--
set jobFolder to (("Users:administrator:Desktop:") & "job:") -- path to desktop points to desktop folder of the current user
--

--//check if the connection to the remote location is available or not
set outp to (do shell script "ping -c 1 10.11.22.115 | grep '0% packet loss'")

if (outp = "1 packets transmitted, 1 packets received, 0% packet loss") then
	
	-- // check if the mapped drive is available or not
	try
		alias (volumesDir) -- // do nothing if it exists
	on error
		-- // create the mount point folder if it does not exist
		do shell script ("mkdir -p " & cmdDir)
	end try
	
	try
		do shell script ("mount_smbfs //" & shareUser & ":" & sharePass & "@" & shareLoc & " " & cmdDir)
		--display dialog cmdDir & " is mounted."
		
	end try
	
	-- //automatically move folders
	set sFolders to paragraphs of (do shell script "find " & quoted form of POSIX path of jobFolder & " -type d -mtime +360 -maxdepth 1")
	if (sFolders is not equal to {""}) then
		
		repeat with oneFolder in sFolders
			
			set is_font to paragraphs of (do shell script "find " & quoted form of POSIX path of oneFolder & " -type d -name '*Fonts*'")
			
			if (is_font is not equal to {""}) then
				
				repeat with fontFolder in is_font
					
					do shell script "/usr/bin/ditto -c -k -rsrc --keepParent " & quoted form of fontFolder & space & quoted form of (fontFolder & ".zip")
					
				end repeat
				
			end if
			
			try
				
				do shell script "mv " & quoted form of oneFolder & space & quoted form of POSIX path of volumesDir
				
			on error
				
				do shell script "rm -R " & quoted form of POSIX path of oneFolder
				
			end try
			
		end repeat
		
	end if
else
	error number -128
end if

As you can see, this project finds some folders and pushes them over to a Windows environment. On the Windows side, there is a program that reads all folders arriving at a specific location and automatically burns them to DVD. I have to zip every folder whose name contains "font" because those folders contain font files generated by jobs created on the Mac, and those font files cannot be viewed properly in Windows.

Starting from the code you provided, I will try something slightly different. I will first find the files/folders containing special characters, then zip each one under the same name but with the characters replaced by "_", and delete the original. Please give me any advice on how to approach this.

Just a quick footnote: although AppleScript’s text item delimiter specifications can be set as a list of strings, no version of OS X has ever interpreted more than the first item of that list – the fault is not in your machine but in AppleScript itself.

Hi guys, I finally got this problem solved. The program is kind of messy, but it works.

Here are the details:


set shareUser to "administrator" --area to change
set sharePass to "password" --area to change
set shareLoc to "10.11.22.115/share" --area to change
set mountDir to "share"
--
set volumesDir to ":users:administrator:desktop:" & mountDir
set cmdDir to "/users/administrator/desktop/" & mountDir
--
set jobFolder to (("Users:administrator:Desktop:") & "job:") -- path to desktop points to desktop folder of the current user
--

---
to switchText from t to r instead of s
	local d
	set d to text item delimiters
	try
		set text item delimiters to s
		set t to t's text items
		-- The text items will be of the same class (string/unicode text) as the original string.
		set text item delimiters to r
		-- Using the first text item (beginning) as the first part of the concatenation means we preserve the class of the original string in the edited string.
		tell t to set t to beginning & ({""} & rest)
		set text item delimiters to d
	on error m number n from o partial result r to t
		set text item delimiters to d
		error m number n from o partial result r to t
	end try
	t
end switchText

---
to replacebadchars(str, replacementStr)
	local badChars
	set badChars to characters of "*\\?\"|"
	-- The plural name of "text item delimiters" would tend to indicate that you could use a list of many such delimiters. In practice (on my machine) only the first one is effective.
	-- In other words, when I tried "switchText from str to replacementStr instead of badChars" the only character that was replaced was the first: asterisk.
	repeat with ch in badChars
		set str to switchText from str to replacementStr instead of contents of ch
	end repeat
	str
end replacebadchars

--//check if the connection to the remote location is available or not
set outp to (do shell script "ping -c 1 10.11.22.115 | grep '0% packet loss'") --area to change

if (outp = "1 packets transmitted, 1 packets received, 0% packet loss") then
	
	-- // check if the mapped drive is available or not
	try
		alias (volumesDir) -- // do nothing if it exists
	on error
		-- // create the mount point folder if it does not exist
		do shell script ("mkdir -p " & cmdDir)
	end try
	
	try
		do shell script ("mount_smbfs //" & shareUser & ":" & sharePass & "@" & shareLoc & " " & cmdDir)
		--display dialog cmdDir & " is mounted."
		
	end try
	
	-- //automatically move folders
	set sFolders to paragraphs of (do shell script "find " & quoted form of POSIX path of jobFolder & " -type d -mtime +360 -maxdepth 1")
	if (sFolders is not equal to {""}) then
		
		repeat with oneFolder in sFolders
			set illegal_syntax to paragraphs of (do shell script "find " & quoted form of POSIX path of oneFolder & " -name '*[*\\\\?\"|]*'")
			
			
			if (illegal_syntax is not equal to {""}) then
				
				repeat with each_record in illegal_syntax
					
					do shell script "/usr/bin/ditto -c -k -rsrc --keepParent " & quoted form of each_record & space & quoted form of (each_record & ".zip")
					
					set zipped_syntax to (do shell script "find " & quoted form of POSIX path of oneFolder & " -name '*[*\\\\?\"|]*'.zip")
					
					set zipped_no_syntax to replacebadchars(zipped_syntax, "_")
					
					
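					-- Copy to the cleaned name, then remove the original only if the copy succeeded.
					-- (A pipe would start rm at the same time as cp, so && is used instead.)
					do shell script ("cp " & quoted form of zipped_syntax & space & quoted form of zipped_no_syntax & space & "&& rm " & quoted form of zipped_syntax)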
					
				end repeat
				
			end if
			
			
			set is_font to paragraphs of (do shell script "find " & quoted form of POSIX path of oneFolder & " -type d -name '*Fonts*'")
			
			
			if (is_font is not equal to {""}) then
				
				repeat with fontFolder in is_font
					
					do shell script "/usr/bin/ditto -c -k -rsrc --keepParent " & quoted form of fontFolder & space & quoted form of (fontFolder & ".zip")
					
				end repeat
				
			end if
			
			try
				
				do shell script "mv " & quoted form of oneFolder & space & quoted form of POSIX path of volumesDir
				
			on error
				
				do shell script "rm -R " & quoted form of POSIX path of oneFolder
				
			end try
			
		end repeat
		
	end if
else
	error number -128
end if



You guys are the best. Thanks for your help. And of course, any advice and better ideas are welcome :)