Suggestions for duplicating directory structure

I have folder A with lots of subdirectories and html files. I need to parse the html files, create text files from the results and put them in a duplicate directory structure. I have the recursive routine to parse the html files but am unsure how to create the directory structure. Suggestions?

tell application "Finder"
	
	set TargetDir to (choose folder)
	set NewFolderName to (name of TargetDir) & " extracted"
	
	if (exists folder NewFolderName of (parent of TargetDir)) then
		delete folder NewFolderName of (parent of TargetDir)
	end if
	make new folder at (parent of TargetDir) with properties {name:NewFolderName}
	set the destination_folder to folder NewFolderName of (parent of TargetDir) as alias
	
	ExtractEntries(TargetDir) of me
end tell

on ExtractEntries(thisFolder)
	tell application "Finder"
		set ItemList to {}
		
		set ItemList to (every item of thisFolder)
		repeat with afolderitem in ItemList
			if class of afolderitem is folder then
				--log "this is a folder, recursively enter it"
				ExtractEntries(afolderitem) of me
			else if name of afolderitem contains ".html" and name of afolderitem does not contain "index.html" then
				--This is where we parse the text file

			end if
		end repeat
	end tell
end ExtractEntries

Not sure if this is the best way to do it or not, but I just threw this together and it is working at a limited level.

Choose the folder whose directory structure you want to replicate (it doesn’t replicate the top level of the folder you choose), then point it to the folder you want to recreate that structure within, and it does its thing.

set sourcePath to choose folder
set replicatePath to choose folder
set mkDirCommand to ""
set replicateItems to paragraphs of (do shell script "cd " & quoted form of POSIX path of sourcePath & "; find ./ -name \"*\" -type d")
repeat with i from 2 to count of replicateItems by 1
	set mkDirCommand to mkDirCommand & "mkdir -p " & (text 4 thru -1 of item i of replicateItems) & ";"
end repeat
do shell script "cd " & quoted form of POSIX path of replicatePath & ";" & mkDirCommand

James, this might also work:

choose folder with prompt "Duplicate folder structure of this folder:"
set sourcePath to POSIX path of result

choose folder with prompt "Make the duplicate folder structure inside this folder:"
set replicatePath to POSIX path of result

do shell script "cd " & quoted form of sourcePath & "; /usr/bin/find . -type d  | /usr/bin/sed -e 's:^\\./:" & replicatePath & ":g' -e 's: :\\\\ :g' | /usr/bin/xargs /bin/mkdir -p"

However, forward slashes in the names might break the sed statements (remember, those display as colons in the shell, and the sed expressions here use the colon as their delimiter).
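One way to sidestep the sed quoting altogether is to let find and xargs pass the folder names as NUL-terminated strings. This is only a sketch, not something taken from the scripts above: it assumes a find and xargs that support `-print0` and `-0` (the BSD and GNU versions do), and the function name `replicate_dirs` is made up for illustration.

```shell
# replicate_dirs SRC DST: recreate SRC's folder tree (folders only) inside DST.
# The function name and argument order are illustrative assumptions.
replicate_dirs() {
	mkdir -p "$2" || return 1
	# find prints each relative folder path followed by a NUL byte;
	# xargs -0 hands every path to mkdir as a single argument, so spaces,
	# quotes and colons in folder names need no escaping.
	(cd "$1" && find . -type d -print0) | (cd "$2" && xargs -0 mkdir -p)
}

# Example call (paths are placeholders):
# replicate_dirs "$HOME/Library/Scripts" "$HOME/Scripts_Text"
```

From AppleScript, the same pipeline could be built as a single `do shell script` string, using `quoted form of` on the two POSIX paths.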

Good ol’ sed… mighty strong and mighty annoying :smiley:

Sorry to report, Gents, that neither James’ nor Bruce’s renditions work for me. This has been a long-standing problem for me - I’d like to write a script that recreates my Scripts folder structure, then loads each script and saves it as text so it would be easy to search. I know how to do all but the first step - making the duplicate folder structure without carefully parsing out all the spaces the folder names might (and do) contain.

Adam,

Can you post an example of a directory/file structure that does not work? I didn’t do much testing, but the bit I did do worked without any problem in a few different scenarios. Let me see yours and I’ll see what I can come up with.

As an edit to this Adam, I did get mine to fail actually, but Bruce’s worked where mine failed. Changing mine to this though made it work :smiley:

set sourcePath to choose folder
set replicatePath to choose folder
set mkDirCommand to ""
set replicateItems to paragraphs of (do shell script "cd " & quoted form of POSIX path of sourcePath & "; find ./ -name \"*\" -type d")
repeat with i from 2 to count of replicateItems by 1
	set mkDirCommand to mkDirCommand & "mkdir -p " & (quoted form of (text 4 thru -1 of item i of replicateItems)) & ";"
end repeat
do shell script "cd " & quoted form of POSIX path of replicatePath & ";" & mkDirCommand

Can you try this and let me know if it works?

Took 22 seconds to do it, but worked a treat. Thanks, James. :slight_smile:

Glad to help Adam!

Hi Guys

Not sure if I might be sticking my oar in here where it’s not wanted, or I might even be just way off the mark.
But could you use the “rsync” Unix command to replicate a folder like for like?
Is this what you’re trying to do here?
I use it to back up some cutter files from one machine to another.

Unless there’s a way to have rsync replicate only folders, but not files, then James’ version is what I need. In brief, I’m using it to replicate all the folders in my Scripts folder in the same hierarchy in a Scripts_Text folder. My purpose is to go through the folders in ~/Library/Scripts, loading each script and writing it as text to the same-named folder in the Scripts_Text folder, which is fully searchable. By preserving the hierarchy that way, it’s a very simple transformation to get from the path to a “found” text file to the script that “owns” it and open that instead.

The script I’m working on is for the first pass. The next script will be for keeping the text versions current, which I’ll do by using mdfind to discover new ones, load them, and write them out as text in the same hierarchy. If I get it all working nicely, including closing the applications that open as you load scripts that call them, so as not to have 20 apps running at once, it will make up for a rather disorganized Scripts folder with about 2000 files in too many folders. :lol: (it says in the fine print) :lol:

Nigel Garvey and I worked on this once before (he did most of the heavy lifting), but it was never entirely satisfactory before we moved on to other things. Things are a bit slow now, so I thought I’d have another kick at the can.
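On the rsync question: its include/exclude filters can restrict a copy to folders only. A minimal sketch, assuming an rsync that supports `--include` and `--exclude`; the function name `copy_tree_only` is hypothetical.

```shell
# copy_tree_only SRC DST: copy SRC's folder hierarchy into DST, leaving files behind.
# The function name is an illustrative assumption.
copy_tree_only() {
	# --include='*/' keeps every directory; --exclude='*' then drops everything else.
	# The trailing slash on the source means "the contents of SRC", not SRC itself.
	rsync -a --include='*/' --exclude='*' "$1"/ "$2"/
}

# Example call (paths are placeholders):
# copy_tree_only "$HOME/Library/Scripts" "$HOME/Scripts_Text"
```

The order of the two filters matters, because rsync applies the first rule that matches each item.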

Hi,

Here’s an AppleScript example if you want to look at it:


set old_folder to (choose folder)
tell application "Finder"
	set new_folder to (make new folder at desktop with properties {name:"NewFolder"})
end tell
MakeDir(old_folder, new_folder)
--
on MakeDir(old_folder, new_folder)
	-- get file list
	tell application "Finder"
		set file_list to every file of old_folder
	end tell
	-- process files and place processed file in new_folder
	repeat with this_file in file_list
		ProcessFile(this_file as alias, new_folder as alias)
	end repeat
	-- get sub folder list
	tell application "Finder"
		set sub_list to every folder of old_folder
	end tell
	repeat with this_sub in sub_list
		tell application "Finder"
			-- get sub folder name
			set sub_name to name of this_sub
			-- make new sub folder in new_folder
			set new_sub to (make new folder at new_folder with properties {name:sub_name})
		end tell
		-- pass this_sub and new_sub to this subroutine to make more folders
		MakeDir(this_sub, new_sub)
	end repeat
	return
end MakeDir
--
on ProcessFile(f, save_folder)
	-- process file f and place in save_folder
	return
end ProcessFile

gl,

I like it Kel - :wink:

Interestingly, it takes the same time to do it on my machine as James’ version, probably because of his shell call in the repeat loop. Thanks! I’ll have a look at putting the rest of my file processing in the ProcessFile handler, and do the directory and the conversion at the same time. The only tricky bit there is “garbage” collection - keeping track of how many apps have been opened, quitting them again, and dealing with occasional errors along the way; some not serious.

Same speeds, interesting. I would have expected a pure AS solution to run faster than my shell. Oh and Adam, I actually never call the shell in the repeat loop, just set up the command. I try to actually invoke the shell as little as possible to avoid slowdowns :smiley:

Missed that James. :confused:

Technically, it’s not “pure AppleScript”, since it uses the Finder. Given that, any similarity in speed is amazing. Presumably, it depends on the structure of the hierarchy. I’ll take a look later on.

James’s “find” shell script has enabled me to speed up an exercise I did a couple of years ago to reproduce an entire folder hierarchy by constructing a single “mkdir” command with a compressed hierarchy specification. eg:
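A compressed specification of this kind relies on the shell's brace expansion, where one mkdir argument expands into several paths. The following is only an illustrative sketch with made-up folder names (`Mail Scripts`, `Inbox`, `Sent`, `Finder`) and a hypothetical function name. Because brace expansion is a bash feature, the sketch calls bash explicitly.

```shell
# make_compressed_tree DEST: build a small, made-up hierarchy under DEST
# with a single mkdir call. All folder names here are hypothetical.
make_compressed_tree() {
	# The one mkdir argument below expands to three paths:
	#   DEST/Mail Scripts/Inbox, DEST/Mail Scripts/Sent and DEST/Finder
	# Quoted names may contain spaces; the braces themselves stay unquoted.
	bash -c 'mkdir -p "$1"/{"Mail Scripts"/{Inbox,Sent},Finder}' _ "$1"
}

# Example call (path is a placeholder):
# make_compressed_tree "$HOME/Desktop/NewRoot"
```

Siblings are separated by commas inside a pair of braces, and a folder's children follow it after a slash, which matches the structure the parsePaths handler builds.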

Originally, I used the Finder to get every folder of the entire contents of the hierarchy as Unicode text, sorted the result lexically by path, and parsed the resulting list, converting any dodgy characters to the POSIX equivalents along the way. But James’s shell script is much faster and returns the paths already sorted, with all characters in POSIX form and no trailing delimiters. Makes everything much easier. Thanks, James. :slight_smile:

considering case
	main()
end considering

on main()
	set rootPath to POSIX path of (choose folder with prompt "Select the root folder of a hierarchy to replicate.")
	set destinationFolder to (choose folder with prompt "Where do you want the replicate hierarchy to be created?")
	
	-- Get the POSIX paths of the hierarchy's subfolders. (The double slashes in the shell script result will come out in the wash.)
	set subfolderPaths to rest of paragraphs of (do shell script "find " & quoted form of rootPath & " -name \"*\" -type d")
	if (subfolderPaths is {}) then set subfolderPaths to {""} -- No subfolders in the hierarchy.
	
	-- Construct a POSIX path (without a trailing slash) for the root folder of the duplicate hierarchy,
	-- ensuring that the folder name's unique at the destination.
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/" as Unicode text
	set rootName to text item -2 of rootPath
	set newRootName to rootName
	set namesAtDestination to (list folder destinationFolder)
	set n to 0
	repeat while (newRootName is in namesAtDestination)
		set n to n + 1
		set newRootName to rootName & " " & n
	end repeat
	set newRootPath to (POSIX path of destinationFolder) & newRootName
	set AppleScript's text item delimiters to astid
	
	-- Construct and use a single mkdir command for the entire duplicate hierarchy.
	do shell script getMkdirString(subfolderPaths, newRootPath)
end main

on getMkdirString(subfolderPaths, newRootPath)
	script o
		property paths : subfolderPaths
		property outputComponents : {"mkdir -p " & quoted form of newRootPath}
		-- Some pre-defined stuff for speed.
		property slash : "/" as Unicode text
		property slashBrace : "/{" as Unicode text
		property comma : "," as Unicode text
		property braceComma : "}," as Unicode text
		
		-- A recursive subhandler that analyses how the paths relate to each other
		-- and decides how to structure the 'mkdir' string.
		on parsePaths(a, b)
			-- At each recursion, items a thru b of the list represent sibling subfolders and their descendents.
			-- Each sibling is sorted immediately before its own descendents, so item a must be an immediate
			-- child of the previous level. The other items are either its descendents, siblings, or siblings' descendents.
			set currentSibling to item a of my paths
			if (a = b) then
				-- Item a is a childless subfolder with no siblings. Append "/'<name>'," to the output.
				set end of my outputComponents to slash & quoted form of text item -1 of currentSibling & comma
			else
				if (item b of my paths begins with currentSibling) then
					-- Item a has descendents but no siblings. Append "/'<name>'" to the output.
					set end of my outputComponents to slash & quoted form of text item -1 of currentSibling
					set closingBraceRequired to false
				else
					-- Item a has siblings. It may or may not have descendents. Append "/{'<name>'" to the output.
					set end of my outputComponents to slashBrace & quoted form of text item -1 of currentSibling
					set closingBraceRequired to true
				end if
				-- Check the other paths in this recursion.
				set i to a + 1
				repeat with j from i to b
					if (item j of my paths begins with currentSibling) then
						-- This is a descendent of the current sibling. Skip it for now.
					else
						-- This is a new sibling.
						if (j > i) then
							-- If the previous sibling had descendents, recurse to sort them out.
							parsePaths(i, j - 1)
						else
							-- Otherwise, just append a comma to the output.
							set end of my outputComponents to comma
						end if
						-- Append the new sibling's name to the output and continue the repeat.
						set currentSibling to item j of my paths
						set end of my outputComponents to quoted form of text item -1 of currentSibling
						set i to j + 1
					end if
				end repeat
				-- At the end of the repeat, sort out any descendents of the last sibling.
				if (i > b) then
				else
					parsePaths(i, b)
				end if
				-- Append a closing brace and a comma to the output if there were siblings at this level.
				if (closingBraceRequired) then set end of my outputComponents to braceComma
			end if
		end parsePaths
	end script
	
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	
	tell o to parsePaths(1, (count subfolderPaths))
	
	-- Coerce the resulting list to Unicode text and tidy up the result.
	set AppleScript's text item delimiters to ""
	set mkdirString to o's outputComponents as Unicode text
	set mkdirString to text 1 thru -2 of mkdirString
	set AppleScript's text item delimiters to ",}"
	set mkdirString to mkdirString's text items
	set AppleScript's text item delimiters to "}"
	set mkdirString to mkdirString as Unicode text
	
	set AppleScript's text item delimiters to astid
	
	return mkdirString
end getMkdirString

Damn technicalities. :lol:

As I suspected, the time difference between James’s and Kel’s scripts is quite a lot on my 2GHz DP G5, Tiger 10.4.8, although the difference between James’s and mine is quite small, making mine possibly overkill. :rolleyes:

Kel: replicate a hierarchy’s folder structure (but not the root folder) to a folder created on the desktop.
JN: replicate a hierarchy’s folder structure (but not the root folder) to a chosen, pre-existing folder.
NG: replicate a hierarchy’s folder structure (including the root folder) to a chosen folder, renaming the root folder if necessary.

Running in Script Editor, timed using GetMilliSec from just after the ‘choose folder’(s) to the completion of the task, and with the references to files in Kel’s script commented out:

My Scripts folder (27 folders + root):
Kel: 1.448 seconds
JN: 0.186 seconds
NG: 0.098 seconds

My Documents folder (120 folders + root):
Kel: 6.33 seconds
JN: 0.898 seconds
NG: 0.358 seconds

A folder containing just one other folder:
Kel: 0.093 seconds
JN: 0.034 seconds
NG: 0.059 seconds

It’s possible to write a faster vanilla/Finder script, placing more emphasis on the vanilla and not using recursion. The following does the above tests in 0.562 seconds, 2.35 seconds, and 0.068 seconds respectively.

set old_folder to (choose folder)

set astid to AppleScript's text item delimiters
considering case
	set AppleScript's text item delimiters to return as Unicode text
	tell application "Finder"
		set old_paths to (every folder of entire contents of old_folder) as Unicode text
		set new_folder to (make new folder at desktop with properties {name:"NewFolder"}) as alias
	end tell
	
	-- Derive the new folder paths from the old ones and the destination folder path.
	set AppleScript's text item delimiters to old_folder as Unicode text
	set new_paths to text items of old_paths
	set AppleScript's text item delimiters to new_folder as Unicode text
	set new_paths to paragraphs of (new_paths as Unicode text)
	set old_paths to paragraphs of old_paths
	
	-- Create the new folders.
	set AppleScript's text item delimiters to ":" as Unicode text
	repeat with this_path in new_paths
		tell application "Finder"
			make new folder at folder (text 1 thru text item -3 of this_path) with properties {name:(text item -2 of this_path)}
		end tell
	end repeat
end considering
set AppleScript's text item delimiters to astid

It works because ‘entire contents’ always finds and returns the outer folders before the inner ones. The outer folders thus come before the inner ones in the ‘new_paths’ list too and are created first by the script. The parallel lists ‘old_paths’ and ‘new_paths’ can be used for the file work later.

Hi everybody,

I don’t see where other scripts are processing files. I must be missing something and will read this thread again.

gl,

For my purposes, Kel, I don’t want to process the files at the same time - I wanted the empty structure and then I’d process the files. See post #12

Nigel: yours is about twice as fast at my task as James’ but about the same as Kel’s if I take the file processing part out. (I didn’t do a lot of testing.)

I ended up just sending the source and dest folder to the sub. It may not be fast or pretty, but it worked and I only needed it once (you can get the entire script and read what it’s for here).

Here’s the bits of the script


tell application "Finder"
	set TargetDir to (choose folder)
	set NewFolderName to (name of TargetDir) & " extracted"
	
	if (exists folder NewFolderName of (parent of TargetDir)) then
		delete folder NewFolderName of (parent of TargetDir)
	end if
	make new folder at (parent of TargetDir) with properties {name:NewFolderName}
	set the destination_folder to folder NewFolderName of (parent of TargetDir) as alias
	
	ExtractEntries(TargetDir, destination_folder) of me
end tell

on ExtractEntries(thisFolder, destfolder)
	tell application "Finder"
		set ItemList to {}
		set ItemList to (every item of thisFolder)
		repeat with afolderitem in ItemList
			if class of afolderitem is folder then
				set x to name of afolderitem
				if not (exists folder x of destfolder) then
					make new folder at destfolder with properties {name:x}
				end if
				set xx to folder x of destfolder as alias
				--log "this is a folder, recursively enter it"
				ExtractEntries(afolderitem, xx) of me
			else if name of afolderitem contains ".html" and name of afolderitem does not contain "index.html" then
				--do stuff to files
				
			end if
		end repeat
	end tell
end ExtractEntries