stripping variable text between trigger characters in filenames

I have a contract to design a magazine of mugshots. Right now, I’m doing a huge amount of manual data organization. It seems natural to use AppleScript to automate this.

The files come to me named in a relatively standard format of firstname-space-lastname-dash-offense. My client is inconsistent in separating the name from the offense, however (there can be one to three dashes, and he often leaves out the spaces), and also in how he writes the same offenses (e.g. “DUS,” “driving under susp.,” “drv. und. susp.,” “no DL,” “no drivers license,” and many other variations).

The main goal is to strip the offenses off of the file names, preferably copying the files to a new folder in the process. It seems like this should be easy, since the offense will always start with a dash and end with the dot of the file extension. Because the offenses are all different, though, a straight find-and-replace won’t work. My initial attempt keeps raising typecasting errors when I step through the filename character by character. Does anyone already have or know of an existing script that will recognize everything between two “flag” characters, so I’m not reinventing the wheel here?

Beyond that, I’d like to be able to clean up the data, then sort it by offense and last name in text files. Ultimately, I want to create a database of offenses that can be ranked, because we try to organize the pages of mugshots with the most serious offenses first, going down in severity. Any ideas on how to script this part of the project would be much appreciated.

Model: PM G4 dual 533
AppleScript: 1.10.7
Browser: Firefox 3.0.7
Operating System: Mac OS X (10.4)


This strips off everything from the first dash up to the name extension:

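-- Example file name; substitute the real name here.
set theName to "firstname lastname-offense.ext"
-- Save the current delimiters, then split on "." to grab the extension.
set {TID, text item delimiters} to {text item delimiters, "."}
set Ex to last text item of theName
-- Split on "-" and keep only the part before the first dash.
set text item delimiters to "-"
set theName to text item 1 of theName & "." & Ex
-- Restore the original delimiters.
set text item delimiters to TID
theName --> "firstname lastname.ext"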
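If it helps, here is a rough sketch of the same delimiter trick wrapped in a handler. The handler name stripOffense is made up for illustration. It tries to cover two cases the snippet above does not: a name with no dash at all (which would otherwise come back as "name.ext.ext") and a space left in front of the dash. It assumes the person's name never contains a dash itself, so a hyphenated surname would be cut short.

```applescript
-- Hypothetical handler: returns fileName with everything from the
-- first dash up to the extension removed.
on stripOffense(fileName)
	set TID to AppleScript's text item delimiters
	-- Split on "." so the last item is the extension.
	set AppleScript's text item delimiters to "."
	if (count of text items of fileName) < 2 then
		-- No extension: treat the whole name as the base.
		set {base, ext} to {fileName, ""}
	else
		set ext to "." & last text item of fileName
		-- Rejoin the rest, so dots inside the offense do not matter.
		set base to (text items 1 thru -2 of fileName) as text
	end if
	-- Everything before the first dash is taken to be the person's name.
	set AppleScript's text item delimiters to "-"
	set personName to text item 1 of base
	set AppleScript's text item delimiters to TID
	-- Drop trailing spaces left behind by a " - " style separator.
	repeat while (length of personName > 1) and (personName ends with " ")
		set personName to text 1 thru -2 of personName
	end repeat
	return personName & ext
end stripOffense

stripOffense("John Doe - drv. und. susp.jpg") --> "John Doe.jpg"
```

The returned name could then be handed to whatever does the copying, for example a Finder duplicate followed by setting the name of the copy.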

Thanks! That helps a LOT. I’ve already got a script to clean up the extra spaces and dashes. I can adapt your code to do pretty much everything else I need done.

I never thought of using item delimiters. Duh! :o)
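
For the ranking part of the original question, one possible starting point is a small lookup table that maps each spelling to a single canonical offense plus a severity number. Everything in this sketch is an assumption for illustration: the names offenseTable and canonicalOffense, the groupings, and especially the severity numbers, which are placeholders rather than real rankings.

```applescript
-- Hypothetical table: {canonical offense, severity, {known spellings}}.
-- The severity values are placeholders only.
property offenseTable : {¬
	{"DUS", 2, {"DUS", "driving under susp.", "drv. und. susp."}}, ¬
	{"No license", 1, {"no DL", "no drivers license"}}}

-- Returns {canonical offense, severity} for a raw offense string.
-- Unknown spellings come back unchanged with severity 0 so they can be
-- spotted and added to the table by hand.
on canonicalOffense(rawOffense)
	repeat with entry in offenseTable
		set {canonical, severity, spellings} to contents of entry
		-- AppleScript text comparison ignores case by default.
		if spellings contains rawOffense then return {canonical, severity}
	end repeat
	return {rawOffense, 0}
end canonicalOffense

canonicalOffense("drv. und. susp.") --> {"DUS", 2}
```

With the severity attached to each record, the records can be written out one per line and sorted by that number.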