AppleScript iTunes with shell script and sed - please help

iTunes does a poor job of handling classical music files. As a result, I have hundreds of files that I want to modify. One example of such a modification would be to eliminate the opus number from the title of the track.

The opus numbers are all formatted somewhat differently. Here are several examples:
Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (“Titan”): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73: Allegro non Troppo
Sym #5 (C min) Op. 67

I am a relative newbie at regex, but on various online simulators the following seems to work for all but the last one:
[- .,:]{1,4}[Oo][Pp][.]? #?([Nn]o[.]? ?)?\d*?([- .,:]{1,3})

In AppleScript, I have attempted to use:


-- from Doug's Applescripts
set myTitle to the name of theTrack
set newTitle to (do shell script "echo \"" & myTitle & "\" | sed -r '" & SED & "'")

and


set thisTitle to aTrack's name
set TitleString to (do shell script "echo " & thisTitle & " | sed  's/" & RegexOpus & "/ /g'")

where RegexOpus is the string above.

The output for each of the lines is consistently just “Sym”. This is not the output on any regex simulator I have been able to find.

Anyone able to tell me what is going on? Is there a debugger for sed?
Finally, can anyone give me the modification to the regex so that it will also match the last example, where the opus number is at the end of the string?

Thanks,
Eric

Hey Eric,

FYI. When you post something like this it’s helpful if you can post a complete script for people to open. The easier you make it for folks to help you the more likely you’ll get a response.


set _text to quoted form of text 2 thru -2 of "
Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73: Allegro non Troppo    
Sym #5 (C min) Op. 67
"
set _cmd to "<<< " & _text & " sed -E 's![ ,]*[oO][pP][[:punct:][:space:]]*([nN][oO])?[[:punct:][:space:]]*[0-9]+[0-9 ;:.-]*! !'"
do shell script _cmd

Oddly the sed line from Doug has the -r (extended regex) switch. That’s not available in the 2005 BSD version of sed that comes with OSX. So unless you’ve used a package installer like MacPorts to install gsed yourself (as I have), it won’t work.

-E is the equivalent switch for the BSD sed.

Using an online regular expression analyzer can help you compose a regex, but it’s a world away from understanding the different flavors of regex and how to apply them properly via different tools - sed, Perl, a text editor like BBEdit or TextWrangler…

It can make you pull your hair out - especially when you first start fooling with them. :slight_smile:

You say you have hundreds of files to modify. If you’re willing to install the Satimage.osax or Shane’s ASObjC_Runner you’ll have a bit smoother experience with regex and Applescript.

You don’t have the time overhead of shelling out to Unix, and you don’t have to worry about using quoted form of to protect your text-input from the shell.
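To illustrate why that protection matters when shelling out, here is a minimal shell sketch (not one of the thread's scripts). It assumes `sh -c` behaves like the shell that `do shell script` invokes, and the `E Min` to `E minor` substitution is only a placeholder. An unquoted `#` at the start of a word begins a shell comment, so everything after `Sym`, including the pipe to sed, is discarded; that is one plausible explanation for the bare “Sym” result reported above.

```shell
title='Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso'

# Unquoted: the shell sees  echo Sym #4 in E Min, ... | sed ...
# and treats everything from "#" onward as a comment, pipe included.
unquoted=$(sh -c "echo $title | sed 's/E Min/E minor/'")

# Quoted: the whole title reaches echo, and then sed, intact.
quoted=$(sh -c "echo '$title' | sed 's/E Min/E minor/'")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

In AppleScript, `quoted form of` does the quoting for you and also copes with titles that themselves contain single quotes, which this hand-quoted sketch does not.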

Here’s an example using the Satimage.osax:


set _text to text 2 thru -2 of "
Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73: Allegro non Troppo    
Sym #5 (C min) Op. 67
"
set newTitles to change "[ ,]*Op[[:punct:][:space:]]*(NO)?[[:punct:][:space:]]*[0-9]+[0-9 ;:.-]*" into " " in _text ¬
	with regexp without case sensitive

The syntax is a bit simpler, because you can turn off case-sensitivity. (You can also do this in gsed.)

Shane will probably drop in and provide syntax for his ASObjC_Runner too.

One thing you should do is copy a pot-load of titles and run your script on the text. Examine this for consistency before you run on the actual music files.

Regex is finicky, so the bigger your test sample the better.

Oh no he won’t :wink: I’m very much a regex part-timer.

What I will say, though, is that Runner’s regex command happily accepts a list of strings in one gulp, which can simplify the searching a bit.

:smiley:

The Satimage.osax will do this too. As input it will take: a string, a list of strings, or a file (the file will be edited in place).

Cool, although I’m a little apprehensive about the change files in place bit. (I ummed and arred about doing the same, but my nerves got the better of me so I return the changed contents as a string.)

Sed will edit in place with the -i switch…

I like having the option.
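For what it's worth, sed's in-place mode can keep a backup copy, which takes some of the nerves out of it. A minimal sketch, assuming the attached-suffix form `-i.bak` (accepted by both the BSD and GNU versions as far as I know); the scratch directory, file name, and simplified opus pattern are made up for the demonstration:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' 'Sym #5 (C min) Op. 67' 'Sym #1: op. 42' > "$workdir/titles.txt"

# Edit in place; the untouched original is kept as titles.txt.bak.
sed -E -i.bak 's/[:;, ]+[Oo][Pp]\. [0-9]+//' "$workdir/titles.txt"

edited=$(cat "$workdir/titles.txt")
backup=$(cat "$workdir/titles.txt.bak")
printf 'edited:\n%s\n\nbackup:\n%s\n' "$edited" "$backup"
```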

Here’s a go with ASObjC_Runner:


set _text to text 2 thru -2 of "
Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73: Allegro non Troppo    
Sym #5 (C min) Op. 67
"
tell application "ASObjC Runner"
	set _text to look for "[ ,]*Op[[:punct:][:space:]]*(NO)?[[:punct:][:space:]]*[0-9]+[0-9 ;:.-]*" in _text ¬
		replacing with " " options "i"
end tell

The syntax was easier than I remembered. :slight_smile:

Yes, I should have added the option (I was trying to avoid extra parameters, though, and dealing with permissions errors). It’s having it as the default that kind of throws me.

:slight_smile: I must confess I never came to grips with the POSIX [[: :]] thing – always looks like some kind of inscrutable emoticon to me :expressionless:

Hi Chris.

This is shorter and gives better results:


set _text to "Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73a: Allegro non Troppo    
Sym #5 (C min) Op. 67"
set newTitles to change "[ ,]+Op([. ]|us )[^0-9]*[0-9][^:]*" into "" in _text ¬
	with regexp without case sensitive

It needs alternatives to the “opus” variations for composers like Bach and Mozart.

Edit: Or, for Mac OS X’s built-in ‘sed’:


set _text to "Sym #4 in E Min, Op. 98: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73a: Allegro non Troppo    
Sym #5 (C min) Op. 67"

set newTitles to (do shell script ("<<<" & quoted form of _text & " sed -E 's/[ ,]+[Oo]p([. ]|us )[^0-9]*[0-9][^:]*//'"))

Wow!
Thank you all. A few comments. First, the ability to accept multiple lines of input in one go is useful for manipulation of text in a file, but not as useful for manipulating the titles of iTunes tracks. As far as I know, the only way that I can modify the titles of the tracks is to use “repeat with aTrack in sel” type loop – is there some way to feed all the titles into sed at once?

Second, I had looked at Satimage and the other installations, but rejected them as solutions because I may very well distribute the AppleScript I am writing for classical music lovers everywhere to use, and I don’t want to have to rely on everyone installing something beyond the script.

Third, I just want to make sure I understand Nigel’s regex:

set newTitles to (do shell script ("<<<" & quoted form of _text & " sed -E 's/[ ,]+[Oo]p([. ]|us )[^0-9]*[0-9][^:]+//'"))

[ ,]+ → matches 1 or more spaces or commas (at least one is required), followed by
[Oo]p → either capital or lowercase O and a lowercase p
([. ]|us ) → either a period, a space or "us " with us lowercase only
[^0-9]* → any number of characters that are not digits
[0-9][^:]+ → huh? 1 digit + any number of characters not a colon?

Fourth, is there a way to prototype this on OSX? The online simulators, as @ccstone points out, appear to be various non-OSX flavors. When I input this regex into the online simulators, for example, they show it capturing
Sym #1 in D Maj op. #22 - ("Titan"): 2b. Trio: Recht Gemächlich
from the space before “op.” through (“Titan”). I want to keep “Titan”, but more importantly, I need a way to test this on OSX so that I can see the effects of my own expressions before running them.

Finally, would this modification of Nigel’s regex work? What I am attempting is to (a) allow for any punctuation, including a space, to begin the match (e.g., “Sym #1: op. 42” - select from “:”), (b) allow for a capital “P”, and (c) make the punctuation at the end optional, because some pieces have formats like Sym #9 (C maj) Op. 98 “The Great”:
s/[:punct:]+[Oo][Pp]([. ]|us )[^0-9]*[0-9]+[:punct:]+?//'")

thank you all so much!

FWIW, if you distribute it as an applet that’s not an issue, in that you can include either Satimage.osax or ASObjC Runner.app in the application bundle.

Hey Eric,

Not directly relevant. One or more lines will be edited by sed, Satimage.osax, or ASObjC_Runner by default. I used multiple lines to demonstrate how the regex worked on all of your example titles.

Feed any of the utilities one line, and you’ll get a result. Feed them more than one line, and you’ll get a result - although with sed you need to be sure the EOLs are linefeeds not returns.
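The linefeed point can be shown with a small shell sketch. The two titles and the simplified opus pattern are placeholders, and using `tr` to turn returns into linefeeds is just one assumed way of normalising the text before sed sees it:

```shell
# Two titles separated by a carriage return, as AppleScript's "return" would give.
titles=$(printf 'Sym #5 (C min) Op. 67\rSym #1: op. 42')
pattern='s/[:;, ]+[Oo][Pp]\. [0-9]+//'

# sed sees one long line, so only the first opus number is removed.
as_returns=$(printf '%s\n' "$titles" | sed -E "$pattern")

# With linefeeds, each title is its own line and each gets edited.
as_linefeeds=$(printf '%s\n' "$titles" | tr '\r' '\n' | sed -E "$pattern")

printf '%s\n' "$as_linefeeds"
```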

Yes.

try
    set _file to (path to desktop as text) & "iTunes_Track_Names.txt"
    set AppleScript's text item delimiters to linefeed
    tell application "iTunes"
        tell playlist "Music"
            set trackNameList to name of (every track whose name starts with "Sym")
        end tell
    end tell
    if trackNameList ≠ {} then
        set trackNameList to trackNameList as text
        set fRef to open for access _file with write permission
        write trackNameList to fRef as «class utf8»
        close access fRef
        tell application "TextEdit"
            activate
            open _file
        end tell
        tell application "Terminal"
            activate # Terminal stays in the background on my machine.
            activate # Force Terminal to come to the front.
            delay 0.25
            if busy of front window = false then
                do script "sed -En '/Op/p' " & quoted form of POSIX path of _file in selected tab of front window
            end if
        end tell
    end if
on error e number n
    try
        close access fRef
    end try
    set e to e & return & return & "Num: " & n
    tell me to set dDlg to display dialog e with title "ERROR!" buttons {"Cancel", "Copy", "OK"} default button "OK"
    if button returned of dDlg = "Copy" then set the clipboard to e
end try

You still have to pair the titles with the unique id of the tracks, but that shouldn’t be hard to do.

As Shane mentioned, you can distribute the Satimage.osax or ASObjC_Runner as part of your script bundle. There’s nothing wrong with using OSX’s built-in tools, but they’re not as flexible unless you jump out to Perl, Python, Ruby, etc.

Sed is not OSX-specific. The version that comes with OSX is BSD, has a POSIX.2 standard syntax, and is quite old (2005). No matter which regex analyzer you use on OSX it’s unlikely to have an option specifically for that flavor of sed. That said (pun intended) I think RegexMatch is one of the nicer such utilities for the Mac. ($4.99 US on the App-Store)

RegexMatch will help you to visually build the regex, BUT you still need to test with sed.

If you don’t have BBEdit you should download its free counterpart TextWrangler. It has full PCRE-compliant regular expressions, and you can run sed tests from it rather than using the Terminal if so desired.

I have BBEdit running 24/7, so I usually start a regex in it and only switch to RegexMatch if I’m having trouble (or if it’s a really complex pattern).

That’s for you to test. :slight_smile:

My script above grabs all the track names starting with ‘Sym’, writes them to a file and then uses the Terminal to run sed-find on ‘Op’ - just to demonstrate sed in the Terminal environment.

Here’s how you might run sed from TextWrangler:

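[code]#!/usr/bin/env bash

# Collect the matching track names from iTunes, one per line.
# The AppleScript's own double quotes are escaped so the shell passes them through.
ASCR=$(osascript -e "
set _file to (path to desktop as text) & \"iTunes_Track_Names.txt\"
set AppleScript's text item delimiters to linefeed
tell application \"iTunes\"
    set trackNameList to name of every track whose name starts with \"r\"
    trackNameList as text
end tell
")

# Run the sed test on the collected names.
sed -E 's!Rock!¢¢¢¢!' <<< "$ASCR"[/code]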
Look for ‘Run’ in the #! menu. I have Cmd-R bound to it for convenience.

It’s easier to compose regex initially when you don’t have to fool with escaping backslashes, and the editing environment of TextWrangler is a bit more friendly than the Terminal.

I suggest you focus on running your script on the selection in iTunes, until you sort out all reasonable kinks. Start with 1 item selected.

Use a script-runner like FastScripts, or Keyboard Maestro to assign the script a keyboard shortcut. (FS is free if you can live with only 15 keyboard shortcuts.)

If you’re going to release your script out into the wild, a lot of folks will want to run it on a per-item basis or perhaps on the selection, but will feel goosey about running a mass update. You will also be responsible if you seriously screw up their music library, so you may want to think about a backup scheme (see last script for ideas).

I think maybe you already have a handle on working with the selection, but here’s a hint anyway:

set nameList to {}
tell application "iTunes"
    set _sel to selection
    if _sel ≠ {} then
        repeat with i in _sel
            set end of nameList to name of i
        end repeat
    end if
end tell
nameList

Okay; here’s another hint for working with things more globally:

set nameCheck to {}
tell application "iTunes"
    tell selection
        set {nameList, idList} to {name, persistent ID}
    end tell
    repeat with i in idList
        tell item 1 of (tracks whose persistent ID is (contents of i))
            set end of nameCheck to its name
        end tell
    end repeat
end tell
{nameList, nameCheck}

Again - I recommend you run your script on a text file of all your relevant track names before even starting to rename actual files in iTunes. Examine the output for anomalies and adjust your pattern as needed.
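One way to examine the output for anomalies is to print only the titles that would change, side by side with their replacements. A rough shell sketch; `preview_changes` is a hypothetical helper name, and the three sample titles and the simplified pattern are stand-ins for your real list and regex:

```shell
# preview_changes EXPR: read titles on stdin (one per line) and print
# "old  =>  new" for every title that the sed expression would alter.
preview_changes() {
  while IFS= read -r title; do
    new=$(printf '%s\n' "$title" | sed -E "$1")
    if [ "$new" != "$title" ]; then
      printf '%s  =>  %s\n' "$title" "$new"
    fi
  done
}

report=$(printf '%s\n' \
  'Sym #5 (C min) Op. 67' \
  'Christmas Oratorio BWV 248' \
  'Sym #1: op. 42' |
  preview_changes 's/[:;, ]+[Oo][Pp]\. [0-9]+//')
printf '%s\n' "$report"
```

With a file of exported track names you would redirect it in instead, e.g. `preview_changes '...' < iTunes_Track_Names.txt`.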

I would also do a backup before running any mass update. Perhaps something like this:

set _consolidate to {}
tell application "iTunes"
    tell tracks of playlist "Music"
        set {nameList, idList} to {name, persistent ID}
    end tell
end tell
repeat with i from 1 to length of nameList
    set end of _consolidate to (item i of nameList) & tab & "¢" & tab & (item i of idList)
end repeat
set AppleScript's text item delimiters to linefeed
set _consolidate to _consolidate as text
set _consolidate to do shell script "column -t -s" & quoted form of (tab & "¢" & tab) & " <<< " & quoted form of _consolidate
tell application "TextEdit"
    activate
    set newDoc to make new document
    tell newDoc
        make new paragraph at end of its text with data _consolidate
        set font to "Menlo"
        set size to 12
    end tell
end tell

Hi professore.

Right on all counts. :slight_smile:

The last part of the regex assumes that any excerpt information in the line will be preceded by a colon as in your examples. So ending with “any number of characters not a colon” makes the regex stop when it reaches a colon (or the end of the line).

However, I’m afraid I goofed with the “+” at the end as it requires there to be at least one character between the first digit of the opus number and the colon. It should of course have been a “*” and I’ve now corrected it in my post above.
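The difference shows up with a one-digit opus number followed directly by a colon. A small shell check, using a single made-up test title (“Op. 8” rather than “Op. 98”):

```shell
title='Sym #4 in E Min, Op. 8: 3. - Allegro Giocoso'

# "+" demands at least one non-colon character after the first digit,
# so a one-digit opus number followed directly by ":" is never matched.
with_plus=$(printf '%s\n' "$title" | sed -E 's/[ ,]+[Oo]p([. ]|us )[^0-9]*[0-9][^:]+//')

# "*" allows zero such characters, so ", Op. 8" is removed as intended.
with_star=$(printf '%s\n' "$title" | sed -E 's/[ ,]+[Oo]p([. ]|us )[^0-9]*[0-9][^:]*//')

printf 'with +: %s\nwith *: %s\n' "$with_plus" "$with_star"
```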

I think Chris has answered your question about prototyping. My own view is that the best way to find out if something works in a script is to write a script explicitly to test the problem you’re trying to solve (here, deleting text with certain defined characteristics from a line) and modify it until it works. On-line simulators and/or third-party text editors are just additional encumbrances.

I didn’t immediately understand your modification criteria. I’ll have another look later this morning.

I think the pattern needs to handle any possible letter case for ‘opus’. Yes?

One of the disappointing things about the sed that comes with OSX is its lack of a case-insensitive switch. This very quickly led me to install MacPorts and install gnu sed.

Well no. It wouldn’t work because [:punct:] should be [[:punct:]]. (The square brackets next to the colons are part of that type of expression for a particular kind of class content. You also need the square brackets for the class itself.) Also, [[:punct:]] would lose you the parenthesis at the end of “Sym #9 (C maj)”.
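A quick shell check of the parenthesis point, using a deliberately simplified pattern (shorthand for this illustration, not the full regex):

```shell
title='Sym #9 (C maj) Op. 98 "The Great"'

# A leading [[:punct:] ]+ also swallows the ")" in front of " Op."
greedy=$(printf '%s\n' "$title" | sed -E 's/[[:punct:] ]+[Oo][Pp]\. [0-9]+//')

# Listing only the wanted separators leaves the parenthesis alone.
listed=$(printf '%s\n' "$title" | sed -E 's/[:;, ]+[Oo][Pp]\. [0-9]+//')

printf 'greedy: %s\nlisted: %s\n' "$greedy" "$listed"
```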

This works for every possibility of which you’ve told us so far or which I’ve been able to foresee:


set _text to "Sym #4 in E Min, Op. 8: 3. - Allegro Giocoso
Sym #1 in C. minor, Op. 11: Allegro Con Fuoco
Sym #1 in D Maj op. #22 - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj, Op. no. 73a: Allegro non Troppo    
Sym #5 (C min) OPUS 67
Sym #1: op. 42
Sym #9 (C maj) Op. 98 \"The Great\"
Sym #10247 Op. 444 No.5 \"The Aardvark\": 3. The Earth Laid Bare By Ants"

set newTitles to (do shell script ("<<<" & quoted form of _text & " sed -E 's/[:;, ]+[Oo][Pp]([. ]|[Uu][Ss] )[^0-9]*[0-9][^: ]*([, ]+(No. ?|#)[0-9]+)?//'"))

--> "Sym #4 in E Min: 3. - Allegro Giocoso
Sym #1 in C. minor: Allegro Con Fuoco
Sym #1 in D Maj - (\"Titan\"): 2b. Trio: Recht Gemächlich
Sym #2 in D Maj: Allegro non Troppo    
Sym #5 (C min)
Sym #1
Sym #9 (C maj) \"The Great\"
Sym #10247 \"The Aardvark\": 3. The Earth Laid Bare By Ants"

What about

Sym #39 in Eb Maj KV 543
or
Christmas Oratorio BWV 248

? :wink:

You are collectively an incredible resource. Thank you once again.

Quick thoughts:

In spite of my regex/sed/grep/general unix newbie status, I am pretty good (not great) with AS, so yes, I have a good idea of how to handle selections/changes, etc. The idea of writing everything to a file, then operating on the file, then rematching is interesting. I haven’t experimented with that before, and I don’t know if I would ever trust something to run in bulk in that fashion on my own library. Interesting to think about though.

  1. I downloaded and installed Satimage, but I need to spend some time figuring out how to get it to work. Installer says that it is installed, but I cannot figure out how to call it from within ScriptDebugger, at least.

  2. Chris: thank you for your very detailed reply. I have TextWrangler, but I have never used it for programming, because I have been using ScriptDebugger for AS and Dreamweaver for Javascript, and at the moment I don’t code in anything else. I will try TW.

  3. Nigel, got everything. Thanks. Also, I cannot wait to hear The Aardvark: Earth Laid Bare by Ants symphony! Awesome.

  4. Stefan: This script is just one of a number of regex find/replace strings that I plan on running on my classical library. As far as I know, no track by Mozart, Bach, or others who have numbering systems named for acolytes also include variants of “opus”, and because K, BWV, etc. are unique to given composers, there shouldn’t be any problem running the regex sequentially. I cannot see any real advantage in trying to handle all of the different modifications simultaneously, since each can operate independently on the output of the previous ones.
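Running the substitutions one after another could look roughly like this in the shell. The first expression is the opus pattern from earlier in the thread; the second, for K/KV and BWV numbers, is a hypothetical pattern written only for this sketch and tested on nothing beyond the two titles shown, and `clean_title` is a made-up helper name:

```shell
# Pass 1: the opus pattern from earlier in the thread.
# Pass 2: a hypothetical pattern for K/KV and BWV catalogue numbers.
clean_title() {
  sed -E \
    -e 's/[:;, ]+[Oo][Pp]([. ]|[Uu][Ss] )[^0-9]*[0-9][^: ]*([, ]+(No. ?|#)[0-9]+)?//' \
    -e 's/[:;, ]+(KV?|BWV)[. ]*[0-9]+[a-z]?//'
}

cleaned=$(printf '%s\n' \
  'Sym #5 (C min) OPUS 67' \
  'Sym #39 in Eb Maj KV 543' \
  'Christmas Oratorio BWV 248' | clean_title)
printf '%s\n' "$cleaned"
```

Each `-e` expression works on the output of the one before it, so further catalogue systems can be added as extra passes without touching the opus pattern.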

Points on releasing into the wild all well-taken. I have thus far resisted releasing any of my code (some of it very useful) because I am worried it isn’t polished enough.

Warmly,

Hey Eric,

I’ll get back to you with more when I have a little time.

For now - after installing the Satimage.osax you must at the very least restart Script Debugger for it to register. This is true whenever you install an osa component.

At worst you might need to reboot your system.

Here are the basics of using find text and change from the Satimage.osax:

set _text to "Now is the time for all good men to come to the aid of their country."

# Find with regex:
set _found to find text "the\\w*" in _text with regexp, all occurrences and string result without case sensitive

# Find with regex and capture:
set _found to find text "Now.*?the *(.+)" in _text using "\\1" with regexp, all occurrences and string result without case sensitive

# Change
set _cng to change "n" into "¢" in _text with regexp without case sensitive

I’m pleased to hear you’re using Script Debugger. What version if I may ask?

Remember that you can open the dictionaries for Scripting Additions from the File menu or with the Cmd-Shift-A keyboard shortcut. From there you can see all the goodies the SIO adds to AppleScript.

Glad too that you’ve grabbed TextWrangler. It’s very scriptable. BBEdit is quite a lot more powerful, but TextWrangler is no slouch.

Chris:

Thanks once again.

The restart of ScriptDebugger did the trick (though not before I had checked both library directories in terminal to ensure that the satimage was installed in one of the scripting additions folders).

I am using SD 5.

I will probably try TW, and possibly buy BBEdit. I suppose there should be a different thread for asking about the relative merits of BBEdit, ScriptDebugger, TW etc…

Also, probably a new thread as well, but now satimage extensions work, but throw an exception if they are inside a “tell iTunes” block… when in their own subroutine, no problems. Annoying.

Warmly,

Excellent - as expected.

Good for you.

This should not be so. Did you restart iTunes after installing the Satimage.osax?

tell application "iTunes"
	set nList to name of selection
	set _cng to change "^" into "¢" in nList with regexp
end tell

Find text by design will throw an error when it doesn’t find anything.

set _text to "Now is the time for all good men to come to the aid of their country."

set foundText to find text "Now[[:blank:]]*(.+?the)" in _text using "\\1" with regexp and string result
--> "is the"

# set foundText to find text "aNow[[:blank:]]*(.+?the)" in _text using "\\1" with regexp and string result
--> Error: No result was returned from some part of this expression.

set foundText to fnd("aNow[[:blank:]]*(.+?the)", _text, false, false, true) of me
--> false (or {} if all occurrences is set to true)
--> If found then string-result or a record depending upon the value of strRslt


---------------------------------------------------------------------------------------
# My basic find handler:
---------------------------------------------------------------------------------------
on fnd(_find, _data, _case, _all, strRslt) # Last 3 are all bool
	try
		find text _find in _data case sensitive _case all occurrences _all string result strRslt with regexp
	on error
		return false
	end try
end fnd
---------------------------------------------------------------------------------------

If the above doesn’t take care of your exception in an iTunes tell-block then please send me an example.