Quick way to list all artists in iTunes?

I need a script that returns a list of all the artists in iTunes. I’ve been using the following script, but on a library with several thousand songs (10,000 on one machine) it takes about a minute to return the results.

tell application "iTunes"
	set myList to get artist of every track of library playlist 1
	my remove_duplicates(myList)
end tell

on remove_duplicates(the_list)
	set return_list to {}
	repeat with an_item in the_list
		if return_list does not contain an_item then set end of return_list to (contents of an_item)
	end repeat
	return return_list
end remove_duplicates

Is there a quicker way to get a list of all the artists in iTunes (without duplicate entries)? It seems like there should be a simple way to do this, but I haven’t been able to think of it yet. Thanks for your help!

  • Jeffrey

Jeff:

I would be interested in knowing if this goes any faster for you:

tell application "iTunes"
	set myList to get artist of (every track of library playlist 1 whose artist is not missing value)
end tell
set art_lst_write to ((path to desktop as Unicode text) & "artists.txt")
set art_lst_final to ((path to desktop as Unicode text) & "artistsFinal.txt")
set alw to open for access file art_lst_write with write permission
repeat with artist_1 in myList
	write artist_1 & "
" to alw
end repeat
close access alw
do shell script "sort -u " & quoted form of (POSIX path of art_lst_write) & " > " & quoted form of (POSIX path of art_lst_final)

First, a text file is written to your desktop that contains all the artists as gathered from iTunes, then the shell routine sorts them alphabetically as well as only grabbing the unique names. A new text file is then written to your desktop of the sort. It goes pretty fast on my machine, but I don’t have a gargantuan iTunes collection handy to test it.

I look forward to your report.

Craig, you don’t need to use text files.

This script works quite well, is shorter, and is much faster!

tell application "iTunes"
	set myList to get artist of (every track of library playlist 1 whose artist is not missing value)
end tell
set oldDelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 10
set allArtists to myList as text
set AppleScript's text item delimiters to oldDelim
do shell script "echo " & (quoted form of allArtists) & " | sort -u "

It took 0.6 seconds for 2750 tracks with about 40 unique artists, all tracks having an artist defined.
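For anyone wondering what `quoted form of` buys you in the script above: it wraps the text in single quotes so the shell treats apostrophes, spaces and other special characters literally. A rough Python analogue, for illustration only, is `shlex.quote`. The artist names below are made-up sample data, not pulled from iTunes, and `printf` is used instead of `echo` only because its handling of backslashes is more predictable.

```python
import shlex
import subprocess

# Hypothetical sample data standing in for the list returned by iTunes.
artists = ["Guns N' Roses", "AC/DC", "Guns N' Roses", "Beck"]
all_artists = "\n".join(artists)

# shlex.quote plays the role of AppleScript's `quoted form of`:
# the result is safe to splice into a shell command line.
command = "printf '%s\\n' " + shlex.quote(all_artists) + " | LC_ALL=C sort -u"
output = subprocess.run(command, shell=True, capture_output=True,
                        text=True, check=True).stdout

# splitlines() is the counterpart of `paragraphs of` in AppleScript.
unique_artists = output.splitlines()
print(unique_artists)
```

Without the quoting step, the apostrophe in the first name would open an unterminated string and the shell command would fail.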

Cool, Vincent, thanks for the improvement. I thought there must be a way to do it without the file, but I am still a ‘shell neophyte.’

Wow! That is much faster, Vincent. Thanks for your help too, Craig. Being the AppleScript/programming ‘neophyte’ I am, is there a way to get the result to come back as an array? Thanks!

  • Jeffrey

Jeffrey:

This version will make a list of the artists for you:

tell application "iTunes"
	set myList to get artist of (every track of library playlist 1 whose artist is not missing value)
end tell
set oldDelim to AppleScript's text item delimiters
set AppleScript's text item delimiters to ASCII character 10
set allArtists to myList as text
set AppleScript's text item delimiters to oldDelim
set new_list to every paragraph of (do shell script "echo " & (quoted form of allArtists) & " | sort -u ")

You now have an alphabetized list (new_list) of all the unique artists in your iTunes library and you can do whatever you want with it.

Interestingly, here an unnamed artist (iTunes 6.0.4) returns an empty Unicode text (""), rather than missing value. So this variation works slightly better for me:

set text item delimiters to ASCII character 10
tell application "iTunes" to tell (get artist of tracks of library playlist 1 ¬
	whose artist is not "") to set l to beginning & ({""} & rest)
set text item delimiters to {""}
paragraphs of (do shell script "echo " & quoted form of l & " | sort -u ")

Very smooth, kai, but I cannot see the purpose of this:

to set l to beginning & ({""} & rest)

And the script seems to run all right without it:

set text item delimiters to ASCII character 10
tell application "iTunes" to tell (get artist of tracks of library playlist 1 ¬
    whose artist is not "") to set l to beginning & rest
set text item delimiters to {""}
paragraphs of (do shell script "echo " & quoted form of l & " | sort -u ")

In fact, I cannot see any difference in the lists generated with or without the {""}. What am I missing?

Briefly, Craig, it’s intended to return either plain or Unicode text - depending on the encoding of the original text. (In other words, it avoids the need for a conditional statement to ensure that any original encoding is preserved.)

I really should write up an explanation of how it works as a FAQ, to make it more readily accessible. Until I make some time to do that, it’s covered (among other things) in a rather lengthy analysis here. (Scroll down message #4 - to the section that begins: “And now for the set MessageWholeText to beginning & ({""} & rest) part…”)

Probably a list whose first two items are named artists…

For example, no problem here:

set text item delimiters to ASCII character 10
set l to {"", "Beatles", "Crickets", "Dire Straits", "Yardbirds"}
tell l to beginning & rest

Nor here:

set text item delimiters to ASCII character 10
set l to {"Beatles", "", "Crickets", "Dire Straits", "Yardbirds"}
tell l to beginning & rest

However, this trips over the first couple of items:

set text item delimiters to ASCII character 10
set l to {"Beatles", "Crickets", "Dire Straits", "Yardbirds"}
tell l to beginning & rest

That’s basically because the current delimiter (ASCII character 10) isn’t inserted between l’s beginning and rest. Instead of explicitly adding the missing character as a further concatenation item, we can do this:

beginning & ({""} & rest)

… which (partially) evaluates as:

"Beatles" & ({""} & {"Crickets", "Dire Straits", "Yardbirds"})

… which (partially) evaluates as:

"Beatles" & {"", "Crickets", "Dire Straits", "Yardbirds"}

… which (given the linefeed character as the delimiter) coerces (through concatenation of a list with text) as:

"Beatles
Crickets
Dire Straits
Yardbirds"

:slight_smile:
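To make the difference concrete, here is a small Python model of the two expressions. It only imitates the rule described above (a list concatenated onto text is first joined with the current text item delimiters); `coerce_to_text` and `concat` are hypothetical helper names, not AppleScript APIs.

```python
DELIMITER = "\n"  # stands in for `ASCII character 10`

def coerce_to_text(items):
    """Model of a list-to-text coercion: join the items with the delimiter."""
    return DELIMITER.join(items)

def concat(text, items):
    """Model of `text & list`: the list is coerced, then appended as-is."""
    return text + coerce_to_text(items)

artists = ["Beatles", "Crickets", "Dire Straits", "Yardbirds"]
beginning, rest = artists[0], artists[1:]

# beginning & rest: no delimiter lands between the first two names.
without_trick = concat(beginning, rest)

# beginning & ({""} & rest): the empty first item forces a leading delimiter.
with_trick = concat(beginning, [""] + rest)

print(repr(without_trick))
print(repr(with_trick))
```

The model says nothing about the plain-versus-Unicode side of the trick; it only shows where the delimiter ends up.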

Bear in mind that the amount of data you can pass on the command line is limited (can’t remember what the limit is these days, might be 128KB); writing to a UTF-8 file and feeding that to stdin is the right approach if you’ve a non-trivial or unknown amount of data to process. Also, you can’t pass strings containing ASCII 0 characters, though that’s probably not a problem in this case.
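A sketch of the file-and-stdin approach, in Python for the sake of a runnable example: the names go into a UTF-8 temporary file and `sort -u` reads that file on stdin, so the data never touches the command line. `os.sysconf("SC_ARG_MAX")` reports the limit on whatever machine it runs on; the artist names are invented.

```python
import os
import subprocess
import tempfile

# The command-line limit on this machine, in bytes (platform dependent).
print("ARG_MAX:", os.sysconf("SC_ARG_MAX"))

# Invented sample data with plenty of duplicates.
names = ["Artist %04d" % (n % 500) for n in range(20000)]

with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as handle:
    handle.write("\n".join(names) + "\n")
    handle.flush()
    handle.seek(0)  # rewind so that sort reads from the start
    result = subprocess.run(["sort", "-u"], stdin=handle,
                            capture_output=True, text=True, encoding="utf-8",
                            check=True, env=dict(os.environ, LC_ALL="C"))

unique_names = result.stdout.splitlines()
print(len(unique_names))
```

Setting `LC_ALL=C` only pins the sort order so that the result is the same on every machine; drop it if you want locale-aware ordering.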

(BTW, the reason Craig’s script is slow will be that it’s looping over a large number of list items - a notorious AS performance sump unless optimised with various kludges - and writing each one to file individually, which may well also add some overhead.)

Clever trick, though note that since you’re feeding that text to ‘do shell script’ this is rather pointless in practice as it’ll all end up as Unicode eventually anyway (‘do shell script’ encodes it as UTF8 for executing on the command line and returns Unicode text when done).

Incidentally, here’s a simple Python solution (requires appscript, takes a couple seconds OMM):

#!/usr/bin/env pythonw

from appscript import *

artists = app('iTunes').library_playlists[1].tracks.artist.get()

d = {} # a fast, simple way to eliminate duplicates is to add them to a Set or a dictionary
for s in artists:
	d[s.lower()] = s # ignore case when eliminating duplicates, though preserve it in final list
l = d.values()
l.sort(lambda a, b: cmp(a.lower(), b.lower())) # sort the list for nice-looking result (optional)
print l
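On a current Python, where `print l` and `cmp` no longer exist, roughly the same idea can be written as below. This is only a sketch: the `artists` list is hard-coded sample data so that it runs without iTunes or appscript.

```python
# Sample data standing in for the result of the appscript call above.
artists = ["Björk", "the beatles", "The Beatles", "björk", "Mary Black"]

unique = {}
for name in artists:
    # Key on the lowercased name so duplicates are matched ignoring case;
    # the value keeps one original spelling (the last one seen).
    unique[name.lower()] = name

# Sort ignoring case for a nice-looking result (optional).
result = sorted(unique.values(), key=str.lower)
print(result)
```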

Thanks for the explanations, and examples, kai & hhas. I appreciate your time putting it all together, and making it clear.

Apologies for the late reply.

I’m not sure that coercing to the original class is entirely pointless. Since iTunes stores and returns text values as Unicode text, it can handle text that includes Unicode-only characters. While the shell script returns Unicode text, the point at which these characters might be lost is actually during list-to-text coercions - before the shell sort script is even entered:

«data utxt0110006F007201110065002000420061006C006101610065007600690107» as Unicode text
display dialog "String:" & return & (result as string) & return & return & "Unicode text:" & return & result
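The hex in that `«data utxt…»` literal is just UTF-16 code units, so it can be decoded outside AppleScript to see which characters are at risk. The Python sketch below assumes that the lossy 'string' class behaves like the legacy Mac Roman encoding, which is a simplifying assumption rather than something established in this thread.

```python
# Hex payload copied from the «data utxt…» literal above (UTF-16, big-endian).
hex_payload = "0110006F007201110065002000420061006C006101610065007600690107"
name = bytes.fromhex(hex_payload).decode("utf-16-be")
print(name)

# Assumption: model the old 'string' class as Mac Roman text.
# Characters with no Mac Roman equivalent are replaced by "?".
lossy = name.encode("mac_roman", errors="replace").decode("mac_roman")
print(lossy)

# Characters that did not survive the round trip.
lost = sorted({a for a, b in zip(name, lossy) if a != b})
print(lost)
```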

However, since AppleScript’s Unicode handling still seems to be somewhat buggy, the theory doesn’t translate consistently into practice. For example, list-to-text coercions of items containing Unicode-only characters can result in distorted or truncated results. Interestingly, there are cases in which “contextual” coercion can return more accurate results:

set artist_list to paragraphs of («data utxt0110006F007201110065002000420061006C0061016100650076006901072029004F006C0067006100200054006100F100F3006E20290042006A00F60072006B00200047007500F0006D0075006E00640073006400F30074007400690072» as Unicode text)

set tid to text item delimiters
set text item delimiters to (ASCII character 10)

set string_text to artist_list as text
set Unicode_text to artist_list as Unicode text
tell artist_list to set contextual_text to beginning & ({""} & rest)

set text item delimiters to tid

display dialog ¬
	"String:" & return & string_text & return & return & ¬
	"Unicode text:" & return & Unicode_text & return & return & ¬
	"Contextual text:" & return & contextual_text

At present though, it seems the best way to side-step anomalies with coercions involving Unicode-only characters is to avoid such coercions altogether - as in this vanilla routine:

to remove_duplicates from i
	script o
		property r : {i's beginning}
		property l : i's rest
	end script
	considering case
		repeat with i from 1 to count o's l
			tell o's l's item i to if it is not in o's r then set o's r's end to it
		end repeat
	end considering
	o's r
end remove_duplicates

to sortList(i)
	script o
		property l : i
	end script
	considering case
		repeat with i from 2 to count o's l
			set v to o's l's item i
			repeat with i from (i - 1) to 1 by -1
				tell o's l's item i to if v < it then
					set o's l's item (i + 1) to it
				else
					set o's l's item (i + 1) to v
					exit repeat
				end if
			end repeat
			if i is 1 and v < o's l's beginning then set o's l's item 1 to v
		end repeat
	end considering
end sortList

tell application "iTunes" to set artist_list to artist of tracks of library playlist 1 whose artist is not ""
set artist_list to remove_duplicates from artist_list
sortList(artist_list)
choose from list artist_list with prompt "Choose an artist:"

Obviously, if none of your favourite artists have ‘exotic’ names, then the issue shouldn’t concern you unduly. But if any names contain Unicode-only characters, then you might benefit from such an approach. :slight_smile:
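For readers who find the terse AppleScript hard to follow, the same two steps look like this in Python. It is a loose transcription, not a line-for-line port: a case-sensitive, order-preserving duplicate filter followed by an in-place insertion sort. The function names mirror the handlers above and the sample list is invented. Python compares strings by code point, so the position of accented names may differ from what AppleScript produces.

```python
def remove_duplicates(items):
    """Keep the first occurrence of each item (case-sensitive)."""
    result = []
    for item in items:
        if item not in result:  # linear scan, as in the handler above
            result.append(item)
    return result

def sort_list(items):
    """Insertion sort, in place, comparing items case-sensitively."""
    for i in range(1, len(items)):
        value = items[i]
        j = i - 1
        # Shift larger items one slot right to open a gap for `value`.
        while j >= 0 and value < items[j]:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = value

artist_list = ["Yardbirds", "Beatles", "Đorđe Balašević", "Beatles", "Crickets"]
artist_list = remove_duplicates(artist_list)
sort_list(artist_list)
print(artist_list)
```

Nothing here goes through a list-to-text coercion, which is the point of the vanilla routine.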

Hmph. Though can’t say I’m hugely surprised. Text handling support is one of the three major reasons I mainly write in Python now; while not absolutely perfect, it’s still a million miles better than AS’s. Hope you’ve filed a Radar bug on this.

A couple quick suggestions:

  1. Don’t make the code so horribly cryptic; the following works the same but is easier to read:
tell application "iTunes" to set lst to artist of tracks of library playlist 1 whose artist is not ""
set text item delimiters to ASCII character 10
set txt to item 1 of lst & ({""} & rest of lst)
set text item delimiters to {""}
...
  2. Add a comment to the second line of the above stating that it’s a workaround for AppleScript’s buggy ‘as Unicode text’ coercion.

One option would be to use TextCommands’ ‘join’ command, which is Unicode-savvy.

HTH

If you know where to find the “iTunes Music Library.xml” file, then you could also do something like this:

property iTML_XML : POSIX path of ((path to music folder as Unicode text) & "iTunes:iTunes Music Library.xml")

do shell script "/usr/bin/grep -o '<key>Artist</key><string>.\\+</string>' " & quoted form of iTML_XML & ¬
	" | /usr/bin/sort -u | /usr/bin/colrm 1 25 | /usr/bin/sed 's~</string>$~~g'"
set allArtists to paragraphs of result
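If regular expressions over XML feel fragile, the library file is a property list and can be parsed as one. The Python sketch below assumes the usual layout (a top-level `Tracks` dictionary whose values carry an `Artist` key) and builds a tiny stand-in library in memory instead of reading the real file; `unique_artists` is a hypothetical helper name.

```python
import plistlib

def unique_artists(plist_bytes):
    """Return the sorted, de-duplicated artist names from library XML."""
    library = plistlib.loads(plist_bytes)
    tracks = library.get("Tracks", {})
    # Tracks without an Artist key (or with an empty one) are skipped.
    names = {t["Artist"] for t in tracks.values() if t.get("Artist")}
    return sorted(names)

# A made-up library with the assumed structure, serialised as XML.
sample = plistlib.dumps({
    "Tracks": {
        "101": {"Name": "Help!", "Artist": "Beatles"},
        "102": {"Name": "Yesterday", "Artist": "Beatles"},
        "103": {"Name": "For Your Love", "Artist": "Yardbirds"},
        "104": {"Name": "Untitled"},
    }
})

print(unique_artists(sample))
```

To try it on a real library, read the file in binary mode and pass its contents to `unique_artists`.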

Of course (Bug ID# 4475021).

If I’d used the tell statement approach without attempting to explain it, that might have been fair comment (although, personally, I find a series of several references easier to read using this form). I’m not sure what was wrong with ‘beginning’, since it’s a perfectly valid AS reference form (not to mention a shade faster than ‘item 1’). Perhaps this kind of stuff could be put down to personal preference…


Would that it were. I tend to use it in place of a conditional statement because it’s shorter and faster - but I’m afraid it doesn’t necessarily prevent truncation of a list when Unicode-only characters are encountered:

set artist_list to paragraphs of (("James Taylor" & return & "Mary Black" & return as Unicode text) & («data utxt0110006F007201110065002000420061006C0061016100650076006901072029004F006C0067006100200054006100F100F3006E20290042006A00F60072006B00200047007500F0006D0075006E00640073006400F30074007400690072»))

set concat_text to artist_list's beginning
repeat with i in rest of artist_list
	set concat_text to concat_text & return & i
end repeat

set tid to text item delimiters
set text item delimiters to return
set string_text to artist_list as string
set Unicode_text to artist_list as Unicode text
tell artist_list to set contextual_text to beginning & ({""} & rest)
set text item delimiters to tid

display dialog ¬
	"String:" & return & string_text & return & return & ¬
	"Unicode text:" & return & Unicode_text & return & return & ¬
	"Contextual text:" & return & contextual_text & return & return & ¬
	"Concatenated text:" & return & concat_text

(All these behaviours were detailed and compared in the bug report mentioned above.)

No arguments there, hhas. Accurate, consistent and easy to use. :slight_smile:

Requires TextCommands:

tell application "TextCommands" to set artist_list to join artist_list using ASCII character 10

Hi Jeff,

I don’t know why this works, but if you pass a reference to the list to the handler, then it’s about 4 times faster.

tell application "iTunes"
	set myList to get artist of every track of library playlist 1
end tell
set list_ref to a reference to myList
--set t1 to the ticks
set r to remove_duplicates(list_ref)
--set t2 to the ticks
--display dialog (t2 - t1)
return r
--
on remove_duplicates(the_list)
	set return_list to {}
	repeat with an_item in the_list
		if return_list does not contain an_item then set end of return_list to ¬
			(contents of an_item)
	end repeat
	return return_list
end remove_duplicates

I must be getting mixed up with something else, but I always thought that you can’t pass a reference to a list to a subroutine.

gl,

If a piece of code requires separate discussion to disambiguate it, that’s usually an indicator that the code itself could be made clearer.

(I suspect you and Mr Garvey are the exceptions that prove the rule.;p)

It’s true that ‘beginning of…’ is a valid reference form in itself; however, it shouldn’t actually do what it’s doing in this particular context.

To explain: ‘beginning of…’ and ‘end of…’ are insertion location references, not element references. An insertion location identifies a point between elements, not an element itself. I don’t know why they return an element when you use them on a list: they don’t when used on strings or application objects, and ‘get beginning of lst’ and ‘set beginning of lst to val’ are not complementary. It’s also not documented in the ASLG and contradicts the explanation of these reference forms that it does give.

Given the inconsistent and undocumented nature of this behaviour, the likely explanation is it’s a bug/shoddy design in AS. Using it to get a list item will confuse readers who are already familiar with insertion location references and how they’re supposed to work, whereas using ‘item 1’ is both correct and self-explanatory.



That’s a pity. In that case probably simplest to stick with ‘allArtists as Unicode text’ and add a warning comment that it may fail on some Unicode characters. (Any speed difference will be absolutely trivial, btw, and makes no noticeable difference to the overall running time.)

HTH

Good stuff. Who knows if it’ll ever get fixed, of course, but at least it shows Apple that folks care about stuff like this.

A reference is an AppleScript value like any other: you can stick it in variables, pass it in and out of handlers, etc. You might be confusing the fact that you can’t create references to local variables, only properties, elements and global variables. e.g. This doesn’t work:

on foo(theList)
    set resultList to {}
    set theListRef to a reference to theList
    set resultListRef to a reference to resultList
    repeat with itemRef in theListRef
    ...
end foo

because ‘theList’ and ‘resultList’ are both local variables. (Your example only works because ‘myList’ is declared in the ‘run’ handler, and variables there are technically global even though they behave like locals for most purposes.[1])

So you’d generally use the script object trick instead, since that gives you object properties which you can make references to:

on foo(lst)
    script k
        property resultList : {}
        property theList : lst
    end script
    repeat with itemRef in k's theList
    ...
end foo

The speed-up trick works whether you use a literal reference to the list or use the ‘a reference to’ operator to create a reference value to it. All still completely horrible, of course, but I don’t see Apple ever bothering to fix the performance problems at source so it’s just something you gotta suck up and do when processing large lists.

HTH


[1] No, this isn’t one of the more sensible parts of AS’s design either… :stuck_out_tongue:

Hi,

Guess what? I used hhas’s script object suggestion and now it’s 6 times faster. I wonder why it’s faster than passing the reference.

tell application "iTunes"
	set myList to get artist of every track of library playlist 1
end tell
set t1 to the ticks
set r to remove_duplicates(myList)
set t2 to the ticks
display dialog (t2 - t1)
return r

on remove_duplicates(the_list)
	script S
		property list_ref : the_list
	end script
	set return_list to {}
	repeat with an_item in list_ref of S
		if return_list does not contain an_item then set end of return_list to ¬
			(contents of an_item)
	end repeat
	return return_list
end remove_duplicates

It doesn’t make sense to me. Seems like the script is doing more work.

gl,