Movie metadata to text file using IMDB

I have a growing collection of movie files, and I wanted a text file for each one containing that movie's metadata. This script searches the IMDB for each selected movie and writes the text file. The instructions are in the script.

(* This script writes the metadata for movie files to text files by searching the IMDB website. You get one text file for each movie file selected in the Finder *)
(* Please note that this script parses the html code from the IMDB website. This is a particularly difficult task because the format and therefore the html code of the website changes from time to time. As such this script may not always work. *)

--how to use the script--
-- 1. set the variable "text_file_folderPath" to the path of the folder where you want the text files to be saved
-- 2. select one or more movie files in the front Finder window and run this script
-- 3. the name of the movie is derived from the name of the selected file, the IMDB (internet movie database) is searched for the movie, from the search results the URL of the movie web page is derived, then the text of the movie web page is searched for its metadata. Once metadata is found a dialog box will be presented to you where you can verify that the metadata is correct. If it is not correct then you can enter new search terms in the dialog box and perform a new search. Once you decide that the metadata is correct press the "Write File" button and the text file is written to the path you have set in the script.
-- 4. In cases where the IMDB search does not return appropriate results then the script will try to use Safari to perform the search, so at times you may see Safari launch and some windows open and close in Safari during this process.

--a tip--
-- sometimes the script cannot accurately get the metadata from the IMDB search (maybe the file name isn't correct etc.), as you'll see when the metadata is presented to you in the dialog box. If after a couple of searches you cannot find the correct metadata, I found that the best solution is to search the IMDB website manually. Once you find the proper web page for the movie, copy the movie number of that web page from the URL bar at the top of the Safari window. That movie number can then be entered as a new search in the metadata dialog box and the proper metadata should be found.

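-- folder where the text files are saved; the Desktop is only an assumed default, change it to your own folder (an HFS path ending in a colon)
set text_file_folderPath to (path to desktop folder) as Unicode text

-- get the selected files
tell application "Finder" to set theFiles to the selection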

repeat with aFile in theFiles
	-- get the title of the movie
	set movie_path to aFile as Unicode text
	set nmExt to my getName_andExtension(movie_path)
	set movie_file_name to item 1 of nmExt
	set movie_title to my stripYear(movie_file_name)
	
	-- search IMDB and get movie metadata
	repeat
		with timeout of 3600 seconds -- ie. do not time out for at least an hour
			-- setup the movie title into imdb search form ie. word1+word2+word3 etc
			set search_title to my titleIMDB(movie_title)
			try
				-- perform the search and find only the top result
				set search1Header to "http://www.imdb.com/find?s=tt&q="
				try
					set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"popular titles\""
				on error
					set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"exact matches\""
				end try
				
				-- obtain the movie number from the top result
				set movie_number to my movieNum(top_result)
				
				-- get the movie web page from imdb using the movie number
				set search2Header to "http://www.imdb.com/title/"
				set movie_page to do shell script "curl " & quoted form of (search2Header & movie_number & "/") without altering line endings
			on error
				-- sometimes when you search for a movie title, instead of presenting you with a list of movies to pick from the website jumps you directly to the movie's web page. My script errors in these cases so it will do it the hard way and use Safari.
				try
					tell application "Safari"
						activate
						open location (search1Header & search_title)
						delay 1
						my web_page_loading()
						set thisurl to the URL of document 1
						tell application "System Events" to tell process "safari"
							keystroke "w" using command down
							keystroke "h" using command down
						end tell
					end tell
					set movie_page to do shell script "curl " & quoted form of thisurl without altering line endings
				end try
			end try
			
			-- strip out pertinent info from web page
			try
				set movie_title to do shell script "echo " & quoted form of movie_page & " | grep -i \"<title>\""
				set movie_title to parseTitle(movie_title)
				set movie_title to my stripYear(movie_title) -- sometimes the release year is added to the name of the movie
			on error
				set movie_title to "missing value"
			end try
			
			try
				set release_date to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"release date:\""
				set release_date to my parseReleaseDate(release_date)
			on error
				try
					set release_date to do shell script "echo " & quoted form of movie_page & " | grep -i \"Sections/Years\""
					set release_date to my parseReleaseDate2(release_date)
				on error
					set release_date to "missing value"
				end try
			end try
			
			try
				set the_genre to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"genre:\""
				set the_genre to my parseGenre(the_genre)
			on error
				set the_genre to "missing value"
			end try
			
			try
				set user_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"<b>user rating:</b>\""
				set user_rating to my parseUserRating(user_rating)
			on error
				set user_rating to "missing value"
			end try
			
			try
				try
					set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"mpaa\""
					set mpaa_rating to my parseMPAARating(mpaa_rating)
				on error
					set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"certification:\" | grep -i \"usa\""
					set mpaa_rating to my parseCertificationRating(mpaa_rating)
				end try
			on error
				set mpaa_rating to "missing value"
			end try
			
			try
				try
					set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"plot outline:\""
				on error
					try
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"plot summary:\""
					on error
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"tagline:\""
					end try
				end try
				set plot_outline to my parsePlotOutline(plot_outline)
			on error
				set plot_outline to "missing value"
			end try
			
			try
				try
					set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Cast overview, first billed only\""
				on error
					set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Credited cast\""
				end try
				set the_cast to my castMembers(the_cast, 4)
				
			on error
				set the_cast to "missing value"
			end try
			
			-- fix html code in decimal unicode format ie. special characters in the form of &#231;
			set movie_title to my decHTML_to_string(movie_title)
			set release_date to my decHTML_to_string(release_date)
			set the_genre to my decHTML_to_string(the_genre)
			set user_rating to my decHTML_to_string(user_rating)
			set mpaa_rating to my decHTML_to_string(mpaa_rating)
			set plot_outline to my decHTML_to_string(plot_outline)
			set the_cast to my decHTML_to_string(the_cast)
			
			-- compile the results into a list of records
			set ann_records to {{ann_heading:"Full Name", ann_value:movie_title}, {ann_heading:"Copyright", ann_value:release_date}, {ann_heading:"Genre", ann_value:the_genre}, {ann_heading:"Warning", ann_value:user_rating}, {ann_heading:"Special Playback Requirements", ann_value:mpaa_rating}, {ann_heading:"Description", ann_value:plot_outline}, {ann_heading:"Performers", ann_value:the_cast}}
			
			-- display the dialog box to choose the next action
			set dialog_text to "Movie Title: " & movie_title & return & "Release Date: " & release_date & return & "Genre: " & the_genre & return & "User Rating: " & user_rating & return & "MPAA Rating: " & mpaa_rating & return & "Plot Outline: " & plot_outline & return & "The Cast: " & the_cast
			display dialog dialog_text buttons {"Cancel", "New Search", "Write Text File"} default button 1
			set buttonEntered to the button returned of result
			if buttonEntered is "New Search" then
				repeat
					display dialog "Type in a new movie title to search." default answer (item 1 of nmExt) with icon note buttons {"Cancel", "OK"} default button "OK"
					set {text_entered, button_pressed} to {text returned, button returned} of the result
					if text_entered is not "" then
						set movie_title to text_entered
						exit repeat
					end if
				end repeat
			else if buttonEntered is "Write Text File" then
				set target_file to text_file_folderPath & movie_file_name & ".txt"
				my writeTo(dialog_text, target_file, false, string)
				exit repeat
			end if
		end timeout
	end repeat
end repeat

(*====================== SUBROUTINES ==========================*)
on titleIMDB(movie_title)
	set text item delimiters to "."
	set search_title to text items of movie_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set text item delimiters to "_"
	set search_title to text items of search_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set search_title to text items of search_title
	set text item delimiters to "+"
	set search_title to search_title as Unicode text
	set text item delimiters to ""
	return search_title
end titleIMDB

on movieNum(the_string)
	set text item delimiters to "<a href=\"/title/"
	set first_cut to text items of the_string
	set part_result to item 2 of first_cut
	set text item delimiters to "/"
	set second_cut to text items of part_result
	set text item delimiters to ""
	set movie_number to item 1 of second_cut
	return movie_number
end movieNum

on parseTitle(movie_title)
	set text item delimiters to "<title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	set text item delimiters to "</title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	return movie_title
end parseTitle

on parseReleaseDate(release_date)
	set text item delimiters to return
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to item 2 of a
	return release_date
end parseReleaseDate

on parseReleaseDate2(release_date)
	set text item delimiters to "</a>"
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to characters -4 thru -1 of (item 1 of a) as Unicode text
	return release_date
end parseReleaseDate2

on parseGenre(the_genre)
	set remove_strings to {return, " / ", " | ", space, "><"} -- " | " covers pages that separate genres with a pipe instead of a slash
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of the_genre
		set text item delimiters to ""
		set the_genre to a as Unicode text
	end repeat
	set a to characters of the_genre
	set the_count to count of a
	set the_genre to {}
	repeat with i from 1 to the_count
		set i_char to item i of a
		if i_char is ">" then
			repeat with j from (i + 1) to the_count
				set j_char to item j of a
				if j_char is ":" or j_char is "=" then
					copy j + 1 to i
					exit repeat
				end if
				if j_char is "<" then
					set end of the_genre to (items (i + 1) thru (j - 1) of a) as Unicode text
					copy j + 1 to i
					exit repeat
				end if
			end repeat
		end if
	end repeat
	set text item delimiters to "," & space
	set the_genre to the_genre as Unicode text
	set text item delimiters to ""
	return the_genre
end parseGenre

on parseUserRating(user_rating)
	set text item delimiters to return -- do shell script converts line feeds to returns by default
	set a to text items of user_rating
	set user_rating to item 2 of a
	set remove_strings to {space, "<b>", "</b>"}
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of user_rating
		set text item delimiters to ""
		set user_rating to a as Unicode text
	end repeat
	return user_rating
end parseUserRating

on parseMPAARating(mpaa_rating)
	set text item delimiters to return
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 2 of a
	return mpaa_rating
end parseMPAARating

on parseCertificationRating(mpaa_rating)
	set text item delimiters to "certificates=USA:"
	set a to text items of mpaa_rating
	set mpaa_rating to item 2 of a
	set text item delimiters to "&&heading="
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 1 of a
	set mpaa_rating to "USA-" & mpaa_rating
	return mpaa_rating
end parseCertificationRating

on parsePlotOutline(plot_outline)
	set text item delimiters to return
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 2 of a
	set text item delimiters to "<a class="
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 1 of a
	return plot_outline
end parsePlotOutline

on castMembers(the_cast, how_many)
	set text item delimiters to "<td class=\"nm\">"
	set a to text items of the_cast
	set text item delimiters to ""
	if how_many > ((count of a) - 1) then set how_many to ((count of a) - 1)
	set cast_members to {}
	repeat with i from 2 to (how_many + 1)
		set end of cast_members to my castMember(item i of a)
	end repeat
	set text item delimiters to ", "
	set cast_string to cast_members as Unicode text
	set text item delimiters to ""
	if cast_string contains "<a href=" then
		set text item delimiters to "</a>"
		set b to text items of cast_string
		set text item delimiters to ""
		set c to b as string
		set text item delimiters to ""
		set i to 0
		set a to ""
		repeat until i is (count of c)
			set i to i + 1
			if item i of c is not "<" then
				set a to a & item i of c
			else
				repeat with j from i to (count of c) -- start at the "<" just found; starting at (count of a) rescans earlier tags and duplicates text
					if item j of c is ">" then
						set i to j
						exit repeat
					end if
				end repeat
			end if
		end repeat
		set cast_string to a
	end if
	return cast_string
end castMembers

on castMember(the_string)
	set c to characters of the_string
	repeat with i from 1 to (count of c)
		set i_char to item i of c
		if i_char is ">" then
			repeat with j from (i + 1) to (count of c)
				set j_char to item j of c
				if j_char is "<" then
					set real_name to (items (i + 1) thru (j - 1) of c) as Unicode text
					exit repeat
				end if
			end repeat
			exit repeat
		end if
	end repeat
	set text item delimiters to "<td class=\"char\">"
	set d to text items of the_string
	set e to item 2 of d
	set text item delimiters to "</td>"
	set d to text items of e
	set text item delimiters to ""
	set char_name to item 1 of d
	set cast_member to real_name & " as " & char_name as Unicode text
	return cast_member
end castMember

on stripYear(movie_title)
	if movie_title contains "(" then
		set x to offset of "(" in movie_title
		if character (x + 5) of movie_title is ")" then
			if length of movie_title > (x + 5) then
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				end if
			else
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title) as Unicode text
				end if
			end if
		end if
	end if
	return movie_title
end stripYear

on getName_andExtension(F)
	set F to F as Unicode text
	set {name:Nm, name extension:Ex} to info for file F
	if Ex is missing value then set Ex to ""
	if Ex is not "" then
		set Nm to text 1 thru ((count Nm) - (count Ex) - 1) of Nm
		set Ex to "." & Ex
	end if
	return {Nm, Ex}
end getName_andExtension

on web_page_loading()
	set theDelay to 10 -- the time in seconds the script will wait to let a web page load
	set numTries to 3 -- the number of stop/reload cycles before giving up
	set my_delay to 0.25
	set myCounter to 0
	set finished to false
	repeat until finished is true
		set startTime to current date
		set myCounter to myCounter + 1
		set web_page_is_loaded to false
		delay my_delay
		tell application "Safari"
			activate
			repeat until web_page_is_loaded is true
				-- check time and do this if 10 seconds hasn't elapsed
				delay 1
				if (startTime + theDelay) > (current date) then
					if name of window 1 contains "Loading" then
						delay my_delay
					else if name of window 1 contains "Untitled" then -- failed
						delay 2
						if name of window 1 contains "Untitled" then
							set web_page_is_loaded to true
							set finished to true
							tell application "System Events" to set frontApp to name of first application process whose frontmost is true
							tell application frontApp to display dialog "The web page will not load!"
						end if
					else if name of window 1 contains "Failed to open page" then
						tell application "System Events" to tell process "Safari"
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
						end tell
						delay my_delay
						set web_page_is_loaded to true
					else
						delay my_delay * 6
						return true
					end if
				else -- if 10 seconds has elapsed then do this
					tell application "System Events" to tell process "Safari"
						-- if we tried 3 times then give up
						if myCounter is numTries then
							keystroke "." using command down -- stop the page
							return false
						else -- try again because we didn't try 3 times yet
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
							delay my_delay
							set web_page_is_loaded to true
						end if
					end tell
				end if
			end repeat
		end tell
	end repeat
end web_page_loading

on decHTML_to_string(the_string)
	set {TIDs, text item delimiters} to {text item delimiters, "&#"}
	set b to text items of the_string
	set text item delimiters to TIDs
	set uniList to {item 1 of b}
	repeat with i from 2 to (count of b)
		set this_string to item i of b
		set string_count to count of this_string
		repeat with j from 1 to string_count
			if item j of this_string is ";" or item j of this_string is "\\" then
				set nDec to text 1 thru (j - 1) of this_string -- get the decimal value
				set nHex to do shell script "perl -e 'printf(\"%04X\", " & nDec & ")'" -- convert decimal to hex
				set uChar to run script "«data utxt" & nHex & "»" -- convert unicode hex to unicode character
				if string_count > j then
					set u_string to (uChar & (text (j + 1) thru string_count of this_string)) as Unicode text
				else
					set u_string to uChar
				end if
				set end of uniList to u_string
				exit repeat
			end if
		end repeat
	end repeat
	return uniList as Unicode text
end decHTML_to_string

on writeTo(this_data, target_file, append_data, mode) -- append_data is true or false, mode is string etc. (no quotes around either)
	try
		set target_file to target_file as Unicode text
		if target_file does not contain ":" then set target_file to POSIX file target_file as Unicode text
		set the open_target_file to open for access file target_file with write permission
		if append_data is false then set eof of the open_target_file to 0
		write this_data to the open_target_file starting at eof as mode
		close access the open_target_file
		return true
	on error
		try
			close access file target_file
		end try
		return false
	end try
end writeTo
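
The tag-stripping loop at the end of castMembers can also be written as a standalone handler. What follows is only a sketch, not the script's own code: the handler name stripTags and the sample string are made up for illustration, and it tracks an "inside a tag" flag instead of the index arithmetic used above.

```applescript
-- stripTags is a hypothetical helper; it is not part of the script above
on stripTags(the_text)
	set clean_text to ""
	set inside_tag to false
	repeat with char_ref in (characters of the_text)
		set this_char to contents of char_ref -- dereference the loop reference before comparing
		if this_char is "<" then
			set inside_tag to true -- everything up to the next ">" is markup
		else if this_char is ">" then
			set inside_tag to false
		else if not inside_tag then
			set clean_text to clean_text & this_char
		end if
	end repeat
	return clean_text
end stripTags

-- made-up sample input in the shape castMember produces
stripTags("Joe Pesci as <a href=\"#\">Nicky Santoro</a>")
-- result: "Joe Pesci as Nicky Santoro"
```

Since every character is visited exactly once and the index never moves backwards, no piece of text can be copied into the result twice.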

FYI. I have updated the above script so that you can now select more than one movie file in the Finder. Previously the script only worked on one selected file, but now you can select one or more files at a time.
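
The tip in the script's header about copying the movie number from the URL bar could also be automated. This is a minimal sketch under a few assumptions: the handler name movieNumFromURL is hypothetical, the sample URL is only an illustration, and it assumes the address has the same "/title/<number>/" shape that the script itself builds in search2Header.

```applescript
-- movieNumFromURL is a hypothetical helper; it is not part of the script above
on movieNumFromURL(the_url)
	set saved_TIDs to text item delimiters
	set movie_number to missing value -- returned when the URL has no "/title/" part
	set text item delimiters to "/title/"
	set parts to text items of the_url
	if (count of parts) > 1 then
		-- the movie number is everything after "/title/" up to the next slash
		set text item delimiters to "/"
		set movie_number to text item 1 of (item 2 of parts)
	end if
	set text item delimiters to saved_TIDs
	return movie_number
end movieNumFromURL

-- sample URL for illustration only
movieNumFromURL("http://www.imdb.com/title/tt0112641/")
-- result: "tt0112641"
```

The returned number could then be typed into the "New Search" dialog, or the URL could be read from Safari with "URL of document 1" as the script already does in its fallback branch.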

Hi Hank, thanks for this great script!! I’ve modified it a bit to suit my needs:

  • it searches for the clipboard text instead of the file selected in the Finder (this way, before I download a movie I don’t know via torrents, I can copy the title, run the script via a Quicksilver shortcut, see the movie info, and decide if I want to download it)

  • I modified the plot outline so it works better in some situations (where it would previously return “missing value”)

  • I added some extra “returns” to the display dialog so it is easier for me to read

  • I made the dialog box the foremost window when using Safari (which I’m always using for this application), though Safari must be open

But, I’m having two issues, both minor, and I’m hoping you may be able to help out or give me some clues.

• First, 9 times out of 10, the cast listing is duplicated. For example, if I search “Casino” I get back the following cast listing:

  • Robert De Niro as Sam Ace Rothstein
  • Sharon Stone as Ginger McKenna
  • Joe Pesci as Ginger McKenna
  • Joe Pesci as Ginger McKenna
  • Joe Pesci as Nicky Santoro
  • James Woods as Nicky Santoro
  • James Woods as Lester Diamond
    …and so on.

• Second, is there any way to remove the " | " from “genre” just to make it neater?
(i.e. "Genre: Crime, |, Drama, more")

Thanks in advance for your help!


Here’s my version of the script:


tell application "System Events"
	set movie_title to (the clipboard as text)
end tell
-- display dialog movie_title

-- repeat with aFile in theFiles
-- get the title of the movie
--set movie_path to aFile as Unicode text
--set nmExt to my getName_andExtension(movie_path)
--set movie_file_name to item 1 of nmExt
--set movie_title to my stripYear(movie_file_name)

-- search IMDB and get movie metadata
repeat
	with timeout of 3600 seconds -- ie. do not time out for at least an hour
		-- setup the movie title into imdb search form ie. word1+word2+word3 etc
		set search_title to my titleIMDB(movie_title)
		try
			-- perform the search and find only the top result
			set search1Header to "http://www.imdb.com/find?s=tt&q="
			try
				set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"popular titles\""
			on error
				set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"exact matches\""
			end try
			
			-- obtain the movie number from the top result
			set movie_number to my movieNum(top_result)
			
			-- get the movie web page from imdb using the movie number
			set search2Header to "http://www.imdb.com/title/"
			set movie_page to do shell script "curl " & quoted form of (search2Header & movie_number & "/") without altering line endings
		on error
			-- sometimes when you search for a movie title, instead of presenting you with a list of movies to pick from the website jumps you directly to the movie's web page. My script errors in these cases so it will do it the hard way and use Safari.
			try
				tell application "Safari"
					activate
					open location (search1Header & search_title)
					delay 1
					my web_page_loading()
					set thisurl to the URL of document 1
					--tell application "System Events" to tell process "safari"
					--	keystroke "w" using command down
					--	keystroke "h" using command down
					--end tell
				end tell
				set movie_page to do shell script "curl " & quoted form of thisurl without altering line endings
			end try
		end try
		
		-- strip out pertinent info from web page
		try
			set movie_title to do shell script "echo " & quoted form of movie_page & " | grep -i \"<title>\""
			set movie_title to parseTitle(movie_title)
			set movie_title to my stripYear(movie_title) -- sometimes the release year is added to the name of the movie
		on error
			set movie_title to "missing value"
		end try
		
		try
			set release_date to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"release date:\""
			set release_date to my parseReleaseDate(release_date)
		on error
			try
				set release_date to do shell script "echo " & quoted form of movie_page & " | grep -i \"Sections/Years\""
				set release_date to my parseReleaseDate2(release_date)
			on error
				set release_date to "missing value"
			end try
		end try
		
		try
			set the_genre to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"genre:\""
			set the_genre to my parseGenre(the_genre)
		on error
			set the_genre to "missing value"
		end try
		
		try
			
			
			--		set user_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"user rating:\""
			set user_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i '<div class=\"meta\">'"
			set user_rating to my parseUserRating(user_rating)
		on error
			set user_rating to "missing value"
			--display dialog user_rating
		end try
		
		try
			try
				set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"mpaa\""
				set mpaa_rating to my parseMPAARating(mpaa_rating)
			on error
				set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"certification:\" | grep -i \"usa\""
				set mpaa_rating to my parseCertificationRating(mpaa_rating)
			end try
		on error
			set mpaa_rating to "n/a"
		end try
		
		try
			try
				set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"plot outline:\""
			on error
				try
					set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"plot:\""
				on error
					try
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"plot summary:\""
					on error
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"tagline:\""
					end try
				end try
			end try
			set plot_outline to my parsePlotOutline(plot_outline)
		on error
			set plot_outline to "missing value"
		end try
		
		try
			try
				set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Cast overview, first billed only\""
			on error
				set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Credited cast\""
			end try
			set the_cast to my castMembers(the_cast, 10)
			
		on error
			set the_cast to "missing value"
		end try
		
		-- fix html code in decimal unicode format ie. special characters in the form of &#231;
		set movie_title to my decHTML_to_string(movie_title)
		set release_date to my decHTML_to_string(release_date)
		set the_genre to my decHTML_to_string(the_genre)
		set user_rating to my decHTML_to_string(user_rating)
		set mpaa_rating to my decHTML_to_string(mpaa_rating)
		set plot_outline to my decHTML_to_string(plot_outline)
		set the_cast to my decHTML_to_string(the_cast)
		
		-- compile the results into a list of records
		set ann_records to {{ann_heading:"Full Name", ann_value:movie_title}, {ann_heading:"Copyright", ann_value:release_date}, {ann_heading:"Genre", ann_value:the_genre}, {ann_heading:"Warning", ann_value:user_rating}, {ann_heading:"Special Playback Requirements", ann_value:mpaa_rating}, {ann_heading:"Description", ann_value:plot_outline}, {ann_heading:"Performers", ann_value:the_cast}}
		
		-- display the dialog box to choose the next action
		tell application "System Events" to set myApp to name of first application process where frontmost is true
		tell application myApp
			set dialog_text to "Movie Title: " & movie_title & return & return & "Release Date: " & release_date & return & return & "Genre: " & the_genre & return & return & "User Rating: " & user_rating & return & return & "MPAA Rating: " & mpaa_rating & return & return & "Plot Outline: " & plot_outline & return & return & "The Cast: " & the_cast
			display dialog dialog_text buttons {"Cancel", "New Search", "Write Text File"} default button 1
			set buttonEntered to the button returned of result
			if buttonEntered is "New Search" then
				repeat
					display dialog "Type in a new movie title to search." default answer movie_title with icon note buttons {"Cancel", "OK"} default button "OK" -- nmExt is not set in this version, so offer the current title
					set {text_entered, button_pressed} to {text returned, button returned} of the result
					if text_entered is not "" then
						set movie_title to text_entered
						exit repeat
					end if
				end repeat
			else if buttonEntered is "Write Text File" then
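				set text_file_folderPath to (path to desktop folder) as Unicode text -- assumed default destination; change it to your own folder
				set target_file to text_file_folderPath & movie_title & ".txt" -- movie_file_name is not set in this version, so the title names the file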
				my writeTo(dialog_text, target_file, false, string)
				exit repeat
			end if
		end tell
	end timeout
end repeat
--end repeat

(*====================== SUBROUTINES ==========================*)
on titleIMDB(movie_title)
	set text item delimiters to "."
	set search_title to text items of movie_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set text item delimiters to "_"
	set search_title to text items of search_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set search_title to text items of search_title
	set text item delimiters to "+"
	set search_title to search_title as Unicode text
	set text item delimiters to ""
	return search_title
end titleIMDB

on movieNum(the_string)
	set text item delimiters to "<a href=\"/title/"
	set first_cut to text items of the_string
	set part_result to item 2 of first_cut
	set text item delimiters to "/"
	set second_cut to text items of part_result
	set text item delimiters to ""
	set movie_number to item 1 of second_cut
	return movie_number
end movieNum

on parseTitle(movie_title)
	set text item delimiters to "<title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	set text item delimiters to "</title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	return movie_title
end parseTitle

on parseReleaseDate(release_date)
	set text item delimiters to return
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to item 2 of a
	return release_date
end parseReleaseDate

on parseReleaseDate2(release_date)
	set text item delimiters to "</a>"
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to characters -4 thru -1 of (item 1 of a) as Unicode text
	return release_date
end parseReleaseDate2

on parseGenre(the_genre)
	set remove_strings to {return, " / ", space, "><"}
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of the_genre
		set text item delimiters to ""
		set the_genre to a as Unicode text
	end repeat
	set a to characters of the_genre
	set the_count to count of a
	set the_genre to {}
	repeat with i from 1 to the_count
		set i_char to item i of a
		if i_char is ">" then
			repeat with j from (i + 1) to the_count
				set j_char to item j of a
				if j_char is ":" or j_char is "=" then
					copy j + 1 to i
					exit repeat
				end if
				if j_char is "<" then
					set end of the_genre to (items (i + 1) thru (j - 1) of a) as Unicode text
					copy j + 1 to i
					exit repeat
				end if
			end repeat
		end if
	end repeat
	set text item delimiters to "," & space
	set the_genre to the_genre as Unicode text
	set text item delimiters to ""
	return the_genre
end parseGenre

on parseUserRating(user_rating)
	set text item delimiters to return -- do shell script converts line feeds to returns by default
	set a to text items of user_rating
	set user_rating to item 2 of a
	--	set remove_strings to {"<b>", "</b>", "<div>", "</div>", "<divid>", "<div id>", space}
	set remove_strings to {"<b>", "</b>", space}
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of user_rating
		set text item delimiters to ""
		set user_rating to a as Unicode text
	end repeat
	return user_rating
end parseUserRating

on parseMPAARating(mpaa_rating)
	set text item delimiters to return
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 2 of a
	return mpaa_rating
end parseMPAARating

on parseCertificationRating(mpaa_rating)
	set text item delimiters to "certificates=USA:"
	set a to text items of mpaa_rating
	set mpaa_rating to item 2 of a
	set text item delimiters to "&&heading="
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 1 of a
	set mpaa_rating to "USA-" & mpaa_rating
	return mpaa_rating
end parseCertificationRating

on parsePlotOutline(plot_outline)
	set text item delimiters to return
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 2 of a
	set text item delimiters to "<a class="
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 1 of a
	return plot_outline
end parsePlotOutline

on castMembers(the_cast, how_many)
	set text item delimiters to "<td class=\"nm\">"
	set a to text items of the_cast
	set text item delimiters to ""
	if how_many > ((count of a) - 1) then set how_many to ((count of a) - 1)
	set cast_members to {}
	repeat with i from 2 to (how_many + 1)
		set end of cast_members to my castMember(item i of a)
	end repeat
	set text item delimiters to ""
	set cast_string to cast_members as Unicode text
	set text item delimiters to ""
	if cast_string contains "<a href=" then
		set text item delimiters to "</a>"
		set b to text items of cast_string
		set text item delimiters to ""
		set c to b as string
		set text item delimiters to ""
		set i to 0
		set a to ""
		repeat until i is (count of c)
			set i to i + 1
			if item i of c is not "<" then
				set a to a & item i of c
			else
				repeat with j from i to (count of c) -- scan forward from the "<" just found, not from the length of the output
					if item j of c is ">" then
						set i to j
						exit repeat
					end if
				end repeat
			end if
		end repeat
		set cast_string to a
	end if
	return cast_string
end castMembers

on castMember(the_string)
	set c to characters of the_string
	repeat with i from 1 to (count of c)
		set i_char to item i of c
		if i_char is ">" then
			repeat with j from (i + 1) to (count of c)
				set j_char to item j of c
				if j_char is "<" then
					set real_name to (items (i + 1) thru (j - 1) of c) as Unicode text
					exit repeat
				end if
			end repeat
			exit repeat
		end if
	end repeat
	set text item delimiters to "<td class=\"char\">"
	set d to text items of the_string
	set e to item 2 of d
	set text item delimiters to "</td>"
	set d to text items of e
	set text item delimiters to ""
	set char_name to item 1 of d
	set cast_member to return & "• " & real_name & " as " & char_name as Unicode text
	return cast_member
end castMember



on stripYear(movie_title)
	if movie_title contains "(" then
		set x to offset of "(" in movie_title
		if character (x + 5) of movie_title is ")" then
			if length of movie_title > (x + 5) then
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				end if
			else
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title) as Unicode text
				end if
			end if
		end if
	end if
	return movie_title
end stripYear

on getName_andExtension(F)
	set F to F as Unicode text
	set {name:Nm, name extension:Ex} to info for file F
	if Ex is missing value then set Ex to ""
	if Ex is not "" then
		set Nm to text 1 thru ((count Nm) - (count Ex) - 1) of Nm
		set Ex to "." & Ex
	end if
	return {Nm, Ex}
end getName_andExtension

on web_page_loading()
	set theDelay to 10 -- the time in seconds the script will wait to let a web page load
	set numTries to 3 -- the number of stop/reload cycles before giving up
	set my_delay to 0.25
	set myCounter to 0
	set finished to false
	repeat until finished is true
		set startTime to current date
		set myCounter to myCounter + 1
		set web_page_is_loaded to false
		delay my_delay
		tell application "Safari"
			activate
			repeat until web_page_is_loaded is true
				-- check time and do this if 10 seconds hasn't elapsed
				delay 1
				if (startTime + theDelay) > (current date) then
					if name of window 1 contains "Loading" then
						delay my_delay
					else if name of window 1 contains "Untitled" then -- failed
						delay 2
						if name of window 1 contains "Untitled" then
							set web_page_is_loaded to true
							set finished to true
							set frontApp to getFrontApp() of frontAppLib
							tell application frontApp to display dialog "The web page will not load!"
						end if
					else if name of window 1 contains "Failed to open page" then
						tell application "System Events" to tell process "Safari"
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
						end tell
						delay my_delay
						set web_page_is_loaded to true
					else
						delay my_delay * 6
						return true
					end if
				else -- if 10 seconds has elapsed then do this
					tell application "System Events" to tell process "Safari"
						-- if we tried 3 times then give up
						if myCounter is numTries then
							keystroke "." using command down -- stop the page
							return false
						else -- try again because we didn't try 3 times yet
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
							delay my_delay
							set web_page_is_loaded to true
						end if
					end tell
				end if
			end repeat
		end tell
	end repeat
end web_page_loading

on decHTML_to_string(the_string)
	set {TIDs, text item delimiters} to {text item delimiters, "&#"}
	set b to text items of the_string
	set text item delimiters to TIDs
	set uniList to {item 1 of b}
	repeat with i from 2 to (count of b)
		set this_string to item i of b
		set string_count to count of this_string
		repeat with j from 1 to string_count
			if item j of this_string is ";" or item j of this_string is "\\" then
				set nDec to text 1 thru (j - 1) of this_string -- get the decimal value
				set nHex to do shell script "perl -e 'printf(\"%04X\", " & nDec & ")'" -- convert decimal to hex
				set uChar to run script "«data utxt" & nHex & "»" -- convert unicode hex to unicode character
				if string_count > j then
					set u_string to (uChar & (text (j + 1) thru string_count of this_string)) as Unicode text
				else
					set u_string to uChar
				end if
				set end of uniList to u_string
				exit repeat
			end if
		end repeat
	end repeat
	return uniList as Unicode text
end decHTML_to_string

on writeTo(this_data, target_file, append_data, mode) -- append_data is true or false, mode is string etc. (no quotes around either)
	try
		set target_file to target_file as Unicode text
		if target_file does not contain ":" then set target_file to POSIX file target_file as Unicode text
		set the open_target_file to open for access file target_file with write permission
		if append_data is false then set eof of the open_target_file to 0
		write this_data to the open_target_file starting at eof as mode
		close access the open_target_file
		return true
	on error
		try
			close access file target_file
		end try
		return false
	end try
end writeTo
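
A side note on decHTML_to_string: it shells out to perl and then uses "run script" for every entity it finds. Below is a minimal sketch of the same idea done in pure AppleScript. It assumes AppleScript 2.0 or later (where "character id" is available), and the handler name decodeDecimalEntities is my own invention, not part of the script above. It only handles decimal entities such as &#231; and leaves anything it cannot parse untouched.

```applescript
-- Hypothetical alternative to decHTML_to_string; assumes AppleScript 2.0 or later
on decodeDecimalEntities(the_string)
	set {TIDs, text item delimiters} to {text item delimiters, "&#"}
	set the_parts to text items of the_string
	set text item delimiters to TIDs
	set the_result to item 1 of the_parts
	repeat with i from 2 to (count of the_parts)
		set this_part to item i of the_parts
		set semi to offset of ";" in this_part
		try
			if semi < 2 then error "no digits before the semicolon"
			-- the digits before ";" are the decimal code point
			set code_point to (text 1 thru (semi - 1) of this_part) as integer
			set the_rest to ""
			if semi < (count of this_part) then set the_rest to text (semi + 1) thru -1 of this_part
			set this_part to (character id code_point) & the_rest
		on error
			-- not a well-formed decimal entity, so put the text back unchanged
			set this_part to "&#" & this_part
		end try
		set the_result to the_result & this_part
	end repeat
	return the_result
end decodeDecimalEntities

decodeDecimalEntities("Fran&#231;ois")
```

If it behaves as intended it could be called in place of decHTML_to_string, one call per metadata field, without launching a shell each time.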

Attached is the up-to-date version of the primary script (as of 25.01.2010). I’ve changed the dialog text formatting a bit.


(* This script writes the metadata for movie files to text files by searching the IMDB website. You get one text file for each movie file selected in the Finder *)
(* Please note that this script parses the html code from the IMDB website. This is a particularly difficult task because the format and therefore the html code of the website changes from time to time. As such this script may not always work. *)

--how to use the script--
-- 1. set the variable "text_file_folderPath" to the path of the folder where you want the text file to be saved
-- 2. select one or more movie files in the front Finder window and run this script
-- 3. the name of the movie is derived from the name of the selected file, the IMDB (internet movie database) is searched for the movie, from the search results the URL of the movie web page is derived, then the text of the movie web page is searched for its metadata. Once metadata is found a dialog box will be presented to you where you can verify that the metadata is correct. If it is not correct then you can enter new search terms in the dialog box and perform a new search. Once you decide that the metadata is correct press the "Write File" button and the text file is written to the path you have set in the script.
-- 4. In cases where the IMDB search does not return appropriate results then the script will try to use Safari to perform the search, so at times you may see Safari launch and some windows open and close in Safari during this process.

--a tip--
-- sometimes the script cannot accurately get the metadata from the IMDB search (maybe the file name isn't correct etc.), as you'll note when the metadata is presented to you in the dialog box. If after a couple of searches you cannot find the correct metadata then I found that the best solution is to manually search the IMDB website yourself. Once you find the proper web page for the movie, copy the movie number of that web page from the URL bar at the top of the Safari window. That movie number can then be inserted into a new search in the metadata dialog box and the proper metadata should be found.

-- get the selected files
tell application "Finder" to set theFiles to the selection

repeat with aFile in theFiles
	-- get the title of the movie
	set movie_path to aFile as Unicode text
	set nmExt to my getName_andExtension(movie_path)
	set movie_file_name to item 1 of nmExt
	set movie_title to my stripYear(movie_file_name)
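	set text_file_folderPath to (path to desktop folder) as Unicode text -- folder for the text files; the path ends with ":" so the file name can be appended directly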
	
	-- search IMDB and get movie metadata
	repeat
		with timeout of 3600 seconds -- ie. do not time out for at least an hour
			-- setup the movie title into imdb search form ie. word1+word2+word3 etc
			set search_title to my titleIMDB(movie_title)
			try
				-- perform the search and find only the top result
				set search1Header to "http://www.imdb.com/find?s=tt&q="
				try
					set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"popular titles\""
				on error
					set top_result to do shell script "curl " & quoted form of (search1Header & search_title) & " | grep -i \"exact matches\""
				end try
				
				-- obtain the movie number from the top result
				set movie_number to my movieNum(top_result)
				
				-- get the movie web page from imdb using the movie number
				set search2Header to "http://www.imdb.com/title/"
				set movie_page to do shell script "curl " & quoted form of (search2Header & movie_number & "/") without altering line endings
			on error
				-- sometimes when you search for a movie title, instead of presenting you with a list of movies to pick from the website jumps you directly to the movie's web page. My script errors in these cases so it will do it the hard way and use Safari.
				try
					tell application "Safari"
						activate
						open location (search1Header & search_title)
						delay 1
						my web_page_loading()
						set thisurl to the URL of document 1
						tell application "System Events" to tell process "safari"
							keystroke "w" using command down
							keystroke "h" using command down
						end tell
					end tell
					set movie_page to do shell script "curl " & quoted form of thisurl without altering line endings
				end try
			end try
			
			-- strip out pertinent info from web page
			try
				set movie_title to do shell script "echo " & quoted form of movie_page & " | grep -i \"<title>\""
				set movie_title to parseTitle(movie_title)
				set movie_title to my stripYear(movie_title) -- sometimes the release year is added to the name of the movie
			on error
				set movie_title to "missing value"
			end try
			
			try
				set release_date to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"release date:\""
				set release_date to my parseReleaseDate(release_date)
			on error
				try
					set release_date to do shell script "echo " & quoted form of movie_page & " | grep -i \"Sections/Years\""
					set release_date to my parseReleaseDate2(release_date)
				on error
					set release_date to "missing value"
				end try
			end try
			
			try
				set the_genre to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"genre:\""
				set the_genre to my parseGenre(the_genre)
			on error
				set the_genre to "missing value"
			end try
			
			try
				set user_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 1 -i \"starbar-meta\""
				set user_rating to my parseUserRating(user_rating)
			on error
				set user_rating to "missing value"
			end try
			
			try
				try
					set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"mpaa\""
					set mpaa_rating to my parseMPAARating(mpaa_rating)
				on error
					set mpaa_rating to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"certification:\" | grep -i \"usa\""
					set mpaa_rating to my parseCertificationRating(mpaa_rating)
				end try
			on error
				set mpaa_rating to "missing value"
			end try
			
			try
				try
					set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"plot:\""
				on error
					try
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"plot summary:\""
					on error
						set plot_outline to do shell script "echo " & quoted form of movie_page & " | grep -A 2 -i \"tagline:\""
					end try
				end try
				set plot_outline to my parsePlotOutline(plot_outline)
			on error
				set plot_outline to "missing value"
			end try
			
			try
				try
					set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Cast overview, first billed only\""
				on error
					set the_cast to do shell script "echo " & quoted form of movie_page & " | grep -i \"Credited cast\""
				end try
				set the_cast to my castMembers(the_cast, 4)
				
			on error
				set the_cast to "missing value"
			end try
			
			-- fix html code in decimal unicode format ie. special characters in the form of &#231; (which stands for ç)
			set movie_title to my decHTML_to_string(movie_title)
			set release_date to my decHTML_to_string(release_date)
			set the_genre to my decHTML_to_string(the_genre)
			set user_rating to my decHTML_to_string(user_rating)
			set mpaa_rating to my decHTML_to_string(mpaa_rating)
			set plot_outline to my decHTML_to_string(plot_outline)
			set the_cast to my decHTML_to_string(the_cast)
			
			-- compile the results into a list of records
			set ann_records to {{ann_heading:"Full Name", ann_value:movie_title}, {ann_heading:"Copyright", ann_value:release_date}, {ann_heading:"Genre", ann_value:the_genre}, {ann_heading:"Warning", ann_value:user_rating}, {ann_heading:"Special Playback Requirements", ann_value:mpaa_rating}, {ann_heading:"Description", ann_value:plot_outline}, {ann_heading:"Performers", ann_value:the_cast}}
			
			-- display the dialog box to choose the next action
			set dialog_text to "Movie Title: " & return & movie_title & return & return & "Release Date: " & return & release_date & return & return & "Genre: " & return & the_genre & return & return & "User Rating: " & return & user_rating & return & return & "MPAA Rating: " & return & mpaa_rating & return & return & "Plot Outline: " & return & plot_outline & return & return & "The Cast: " & return & the_cast
			display dialog dialog_text buttons {"Cancel", "New Search", "Write Text File"} default button 1
			set buttonEntered to the button returned of result
			if buttonEntered is "New Search" then
				repeat
					display dialog "Type in a new movie title to search." default answer (item 1 of nmExt) with icon note buttons {"Cancel", "OK"} default button "OK"
					set {text_entered, button_pressed} to {text returned, button returned} of the result
					if text_entered is not "" then
						set movie_title to text_entered
						exit repeat
					end if
				end repeat
			else if buttonEntered is "Write Text File" then
				set target_file to text_file_folderPath & movie_file_name & ".txt"
				my writeTo(dialog_text, target_file, false, string)
				exit repeat
			end if
		end timeout
	end repeat
end repeat

(*====================== SUBROUTINES ==========================*)
on titleIMDB(movie_title)
	set text item delimiters to "."
	set search_title to text items of movie_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set text item delimiters to "_"
	set search_title to text items of search_title
	set text item delimiters to space
	set search_title to search_title as Unicode text
	set search_title to text items of search_title
	set text item delimiters to "+"
	set search_title to search_title as Unicode text
	set text item delimiters to ""
	return search_title
end titleIMDB

on movieNum(the_string)
	set text item delimiters to "<a href=\"/title/"
	set first_cut to text items of the_string
	set part_result to item 2 of first_cut
	set text item delimiters to "/"
	set second_cut to text items of part_result
	set text item delimiters to ""
	set movie_number to item 1 of second_cut
	return movie_number
end movieNum

on parseTitle(movie_title)
	set text item delimiters to "<title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	set text item delimiters to "</title>"
	set a to text items of movie_title
	set text item delimiters to ""
	set movie_title to a as Unicode text
	return movie_title
end parseTitle

on parseReleaseDate(release_date)
	set text item delimiters to return
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to item 3 of a
	return release_date
end parseReleaseDate

on parseReleaseDate2(release_date)
	set text item delimiters to "</a>"
	set a to text items of release_date
	set text item delimiters to ""
	set release_date to characters -4 thru -1 of (item 1 of a) as Unicode text
	return release_date
end parseReleaseDate2

on parseGenre(the_genre)
	set remove_strings to {return, " / ", space, "|", "><"}
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of the_genre
		set text item delimiters to ""
		set the_genre to a as Unicode text
	end repeat
	set a to characters of the_genre
	set the_count to count of a
	set the_genre to {}
	repeat with i from 1 to the_count
		set i_char to item i of a
		if i_char is ">" then
			repeat with j from (i + 1) to the_count
				set j_char to item j of a
				if j_char is ":" or j_char is "=" then
					copy j + 1 to i
					exit repeat
				end if
				if j_char is "<" then
					set end of the_genre to (items (i + 1) thru (j - 1) of a) as Unicode text
					copy j + 1 to i
					exit repeat
				end if
			end repeat
		end if
	end repeat
	set text item delimiters to "," & space
	set the_genre to the_genre as Unicode text
	set text item delimiters to ""
	return the_genre
end parseGenre

on parseUserRating(user_rating)
	set text item delimiters to return --(ASCII character 10)
	set a to text items of user_rating
	set user_rating to item 2 of a
	set remove_strings to {space, "<b>", "</b>"}
	repeat with a_string in remove_strings
		set text item delimiters to a_string
		set a to text items of user_rating
		set text item delimiters to ""
		set user_rating to a as Unicode text
	end repeat
	return user_rating
end parseUserRating

on parseMPAARating(mpaa_rating)
	set text item delimiters to return
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 3 of a
	return mpaa_rating
end parseMPAARating

on parseCertificationRating(mpaa_rating)
	set text item delimiters to "certificates=USA:"
	set a to text items of mpaa_rating
	set mpaa_rating to item 2 of a
	set text item delimiters to "&&heading="
	set a to text items of mpaa_rating
	set text item delimiters to ""
	set mpaa_rating to item 1 of a
	set mpaa_rating to "USA-" & mpaa_rating
	return mpaa_rating
end parseCertificationRating

on parsePlotOutline(plot_outline)
	set text item delimiters to return
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 3 of a
	set text item delimiters to "<a class="
	set a to text items of plot_outline
	set text item delimiters to ""
	set plot_outline to item 1 of a
	return plot_outline
end parsePlotOutline

on castMembers(the_cast, how_many)
	set text item delimiters to "<td class=\"nm\">"
	set a to text items of the_cast
	set text item delimiters to ""
	if how_many > ((count of a) - 1) then set how_many to ((count of a) - 1)
	set cast_members to {}
	repeat with i from 2 to (how_many + 1)
		set end of cast_members to my castMember(item i of a)
	end repeat
	set text item delimiters to ", "
	set cast_string to cast_members as Unicode text
	set text item delimiters to ""
	if cast_string contains "<a href=" then
		set text item delimiters to "</a>"
		set b to text items of cast_string
		set text item delimiters to ""
		set c to b as string
		set text item delimiters to ""
		set i to 0
		set a to ""
		repeat until i is (count of c)
			set i to i + 1
			if item i of c is not "<" then
				set a to a & item i of c
			else
				repeat with j from i to (count of c) -- scan forward from the "<" just found, not from the length of the output
					if item j of c is ">" then
						set i to j
						exit repeat
					end if
				end repeat
			end if
		end repeat
		set cast_string to a
	end if
	return cast_string
end castMembers

on castMember(the_string)
	set c to characters of the_string
	repeat with i from 1 to (count of c)
		set i_char to item i of c
		if i_char is ">" then
			repeat with j from (i + 1) to (count of c)
				set j_char to item j of c
				if j_char is "<" then
					set real_name to (items (i + 1) thru (j - 1) of c) as Unicode text
					exit repeat
				end if
			end repeat
			exit repeat
		end if
	end repeat
	set text item delimiters to "<td class=\"char\">"
	set d to text items of the_string
	set e to item 2 of d
	set text item delimiters to "</td>"
	set d to text items of e
	set text item delimiters to ""
	set char_name to item 1 of d
	set cast_member to real_name & " as " & char_name as Unicode text
	return cast_member
end castMember

on stripYear(movie_title)
	if movie_title contains "(" then
		set x to offset of "(" in movie_title
		if character (x + 5) of movie_title is ")" then
			if length of movie_title > (x + 5) then
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title & characters (x + 6) thru -1 of movie_title) as Unicode text
				end if
			else
				if character (x - 1) of movie_title is space then
					set movie_title to (characters 1 thru (x - 2) of movie_title) as Unicode text
				else
					set movie_title to (characters 1 thru (x - 1) of movie_title) as Unicode text
				end if
			end if
		end if
	end if
	return movie_title
end stripYear

on getName_andExtension(F)
	set F to F as Unicode text
	set {name:Nm, name extension:Ex} to info for file F
	if Ex is missing value then set Ex to ""
	if Ex is not "" then
		set Nm to text 1 thru ((count Nm) - (count Ex) - 1) of Nm
		set Ex to "." & Ex
	end if
	return {Nm, Ex}
end getName_andExtension

on web_page_loading()
	set theDelay to 10 -- the time in seconds the script will wait to let a web page load
	set numTries to 3 -- the number of stop/reload cycles before giving up
	set my_delay to 0.25
	set myCounter to 0
	set finished to false
	repeat until finished is true
		set startTime to current date
		set myCounter to myCounter + 1
		set web_page_is_loaded to false
		delay my_delay
		tell application "Safari"
			activate
			repeat until web_page_is_loaded is true
				-- check time and do this if 10 seconds hasn't elapsed
				delay 1
				if (startTime + theDelay) > (current date) then
					if name of window 1 contains "Loading" then
						delay my_delay
					else if name of window 1 contains "Untitled" then -- failed
						delay 2
						if name of window 1 contains "Untitled" then
							set web_page_is_loaded to true
							set finished to true
							-- frontAppLib is not defined anywhere in this script, so show the dialog directly
							display dialog "The web page will not load!" buttons {"OK"} default button 1
						end if
					else if name of window 1 contains "Failed to open page" then
						tell application "System Events" to tell process "Safari"
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
						end tell
						delay my_delay
						set web_page_is_loaded to true
					else
						delay my_delay * 6
						return true
					end if
				else -- if 10 seconds has elapsed then do this
					tell application "System Events" to tell process "Safari"
						-- if we tried 3 times then give up
						if myCounter is numTries then
							keystroke "." using command down -- stop the page
							return false
						else -- try again because we didn't try 3 times yet
							keystroke "." using command down -- stop the page
							delay my_delay
							keystroke "r" using command down -- reload the page
							delay my_delay
							set web_page_is_loaded to true
						end if
					end tell
				end if
			end repeat
		end tell
	end repeat
end web_page_loading

on decHTML_to_string(the_string)
	set {TIDs, text item delimiters} to {text item delimiters, "&#"}
	set b to text items of the_string
	set text item delimiters to TIDs
	set uniList to {item 1 of b}
	repeat with i from 2 to (count of b)
		set this_string to item i of b
		set string_count to count of this_string
		repeat with j from 1 to string_count
			if item j of this_string is ";" or item j of this_string is "\\" then
				set nDec to text 1 thru (j - 1) of this_string -- get the decimal value
				set nHex to do shell script "perl -e 'printf(\"%04X\", " & nDec & ")'" -- convert decimal to hex
				set uChar to run script "«data utxt" & nHex & "»" -- convert unicode hex to unicode character
				if string_count > j then
					set u_string to (uChar & (text (j + 1) thru string_count of this_string)) as Unicode text
				else
					set u_string to uChar
				end if
				set end of uniList to u_string
				exit repeat
			end if
		end repeat
	end repeat
	return uniList as Unicode text
end decHTML_to_string

on writeTo(this_data, target_file, append_data, mode) -- append_data is true or false, mode is string etc. (no quotes around either)
	try
		set target_file to target_file as Unicode text
		if target_file does not contain ":" then set target_file to POSIX file target_file as Unicode text
		set the open_target_file to open for access file target_file with write permission
		if append_data is false then set eof of the open_target_file to 0
		write this_data to the open_target_file starting at eof as mode
		close access the open_target_file
		return true
	on error
		try
			close access file target_file
		end try
		return false
	end try
end writeTo
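
Regarding the tip about pasting the movie number into the new search dialog: here is a minimal sketch of how the entry could be recognised as a movie number so the title page is fetched directly instead of going through the search. The handler names isMovieNumber and moviePageURL and the sample number are assumptions made for illustration; they are not in the script above. The URL prefix is the same one the script stores in search2Header.

```applescript
-- Hypothetical helpers, not part of the script above
-- isMovieNumber: true when the entry looks like an IMDB movie number ("tt" followed only by digits)
on isMovieNumber(the_entry)
	if (count of the_entry) < 3 then return false
	if the_entry does not start with "tt" then return false
	repeat with i from 3 to (count of the_entry)
		if character i of the_entry is not in "0123456789" then return false
	end repeat
	return true
end isMovieNumber

-- moviePageURL: the same address the script builds from search2Header and movie_number
on moviePageURL(movie_number)
	return "http://www.imdb.com/title/" & movie_number & "/"
end moviePageURL

set text_entered to "tt0133093" -- example entry typed into the "New Search" dialog
if isMovieNumber(text_entered) then
	set page_url to moviePageURL(text_entered)
else
	set page_url to missing value -- not a movie number, so fall back to the normal title search
end if
```

In the main loop, page_url could then be handed to the same curl call that fetches the movie page, which would skip the search step and the Safari fallback for entries of this kind.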