New scripter trying to compare files

Hello,

I recently discovered I could use scripts to have my computer do the grunt work for me. Being a mathematician, I immediately fell in love with the idea. And so my story begins… I’ve started to learn the basics, and I got my computer to do all kinds of small routines for me. Then I decided to attack a bigger project: cleaning up my large iTunes music folder. The goal is to track down items with identical content. The hierarchy is as follows:
Music Folder > Artists > Albums > Files (songs)
We make the assumption that 2 files lying in different Albums are necessarily distinct, and thus need not be compared.
So, I built a nested routine of repeats roughly saying:

repeat with i from 1 to (number of Artist folders)
	tell folder i
		repeat with j from 1 to (number of Album folders)
			tell folder j
				[Here goes the compare/delete routine]
			end tell
		end repeat
	end tell
end repeat

I want to isolate groups of files such that all members of each group share the same content. In particular, they have the same size, but equal size alone is not enough to make a definite decision. The files do not have the same name; rather, iTunes automatically names them “title.mp3”, “title 1.mp3”, “title 2.mp3”, etc. This makes it very hard to compare the file names.

I’m at sea. I have no idea how to solve my problem without writing a routine that would be very slow. One idea would be to take for granted that two files are identical if they have the same size and the same first word in their title. Then again, although that is very likely to be true, there could be exceptions.

So, I basically need a “tag” that uniquely describes the content of a given file.

I hope I’m making sense. I’m so new at this. I would appreciate any word of wisdom…

Best wishes to all,

Yann

Model: Mac Book Pro
AppleScript: Whichever came with the laptop
Browser: Safari
Operating System: Mac OS X (10.4)

Yann:

My first suggestion is to try to let iTunes do all the work for you. If you go to the View menu, there is a choice for Show Duplicates. That works nicely for most duplicate tracks.

For others, the following script uses iTunes to cycle through all the artists, then check to see if there are any duplicate track names by each artist:


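-- Artists that have already been checked
set the_artists to {}
tell application "iTunes"
	-- Candidates are only collected in their own playlist; nothing is deleted
	if not (exists playlist "Possible Dupes") then
		make new playlist with properties {name:"Possible Dupes"}
	end if
	-- playlist 1 is the main library
	set all_artists to artist of every track of playlist 1
	repeat with an_artist in all_artists
		if the_artists does not contain (an_artist as string) then
			-- Track names already checked for this artist, so each name is searched only once
			set checked_names to {}
			set artist_track_names to (the name of every track of playlist 1 whose artist is (an_artist as string))
			repeat with a_track_name in artist_track_names
				if checked_names does not contain (a_track_name as string) then
					set common_tracks to (every track of playlist 1 whose name is (a_track_name as string) and artist is (an_artist as string))
					-- Two or more tracks sharing name and artist are possible duplicates
					if (count common_tracks) > 1 then
						repeat with every_track in common_tracks
							duplicate every_track to playlist "Possible Dupes"
						end repeat
					end if
					set end of checked_names to (a_track_name as string)
				end if
			end repeat
			set end of the_artists to (an_artist as string)
		end if
	end repeat
end tell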

If it does not work the first time, run it again. I purposely added a duplicate file before I ran it, but the script did not find it until the second run, and I am unsure why.

You probably don’t want to mess around with the actual iTunes folder directly, because the library file iTunes uses to organize your music records where each file lives in those folders. Files deleted or moved behind its back leave entries in iTunes that point to nothing. It’s best to have iTunes do the work for you, so as to keep everything clean.
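
As for the “tag” you asked about: a common choice is a checksum of the file’s contents. Here is a minimal sketch, not a finished tool. It assumes the md5 command-line tool that comes with Mac OS X, that the “Possible Dupes” playlist from the script above already exists, and content_tag is just a name I made up for the handler. It only lists the names of tracks whose file matches an earlier one in the playlist; it does not delete anything.

```applescript
-- Hypothetical handler: returns the MD5 checksum of a file's contents as text.
-- Assumes the md5 command-line tool is available to "do shell script".
on content_tag(a_file)
	return do shell script "md5 -q " & quoted form of POSIX path of a_file
end content_tag

set seen_tags to {}
set repeated_tracks to {}
tell application "iTunes"
	repeat with a_track in (get every file track of playlist "Possible Dupes")
		set track_file to location of a_track
		-- location is missing value when iTunes cannot find the file
		if track_file is not missing value then
			set a_tag to my content_tag(track_file)
			if seen_tags contains a_tag then
				-- Same checksum as an earlier file: almost certainly the same content
				set end of repeated_tracks to (name of a_track)
			else
				set end of seen_tags to a_tag
			end if
		end if
	end repeat
end tell
return repeated_tracks
```

Two caveats. A checksum is not strictly unique, but two different files sharing one by accident is so unlikely that it is normally ignored. Also, the checksum covers the whole file, including any tags embedded in the MP3, so two copies of the same song whose embedded tags differ will get different checksums. Running it only on the short “Possible Dupes” list keeps it from being slow.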

Good luck,