Attempting to index occurrences of all paragraphs in a text

I’ve got quite the predicament at work, and honestly I’m a bit surprised there isn’t a dedicated program for this. I have a text file that contains about 25,000 email addresses (sorted alphabetically), and I need to find out two things: 1) how many unique addresses exist and what they are, and 2) how many times each address occurs.

I’m not too bad with AppleScript, and I even thought about using grep for this. The problem is that they are email addresses, so I’d have to write a script to add a bunch of slashes in there to make them Unix kosher. I’m hoping that one of the geniuses on here can help me figure out a way to index these files and figure out just how many emails each person was involved in.

My idea was to cat the file in and go the brute-force route, pummeling through all 25,000 addresses 25,000 times, but I don’t think that’s very efficient, especially with how slow AppleScript is.

Thank you!

PS. If you know of a program that does this for OS X (or even Windows), that would be a HUGE help and might save us all some time.

Model: MacBook Pro 17"
Browser: Firefox 3.5.2
Operating System: Mac OS X (10.5)

Here is a Ruby way to handle this.
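For anyone following along, here is a minimal sketch of one way to do the counting in Ruby. It is not necessarily the exact script from this post. It assumes one address per line, and the helper name `count_addresses` and the sample addresses are made up for illustration. A Hash with a default of 0 means each address is visited once, so there is no need to loop through the list 25,000 times.

```ruby
# Count how often each address occurs in separator-delimited text.
# `count_addresses` is a hypothetical helper name, not from the thread.
def count_addresses(text, separator = "\n")
  counts = Hash.new(0)            # unseen addresses start at 0
  text.split(separator).each do |line|
    address = line.strip          # drop stray whitespace around the address
    next if address.empty?        # skip blank lines
    counts[address] += 1
  end
  counts
end

# Made-up sample data; a real run would use File.read on the address file.
sample = "bob@example.com\nalice@example.com\nbob@example.com\n"
counts = count_addresses(sample)

puts "#{counts.size} unique addresses"
counts.sort.each { |address, n| puts "#{n}\t#{address}" }
```

To run it against a real file, replace `sample` with something like `File.read(ARGV[0])` and pass the file path on the command line.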

Here is the result on 27,000 emails in a return-separated file.

Wow, much simpler than what I was working on with Applescript!

Didn’t work for me :(

and I used the following command line

I’ve never played much with Ruby so I’m not really sure

Are all the emails on that one line?
What are the line endings in the emails file?

Standard line breaks, ie.

What program is creating the file? I am pretty sure this is a line ending issue.

Try changing the “\n” to “\r” or “\r\n” and see if either of those works.
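If you don't want to guess, a small sketch like the one below converts both alternatives to “\n” before splitting. The helper name `normalize_newlines` and the sample data are hypothetical. It also shows why the wrong separator fails: text that uses “\r” contains no “\n” at all, so splitting on “\n” returns the whole file as a single entry.

```ruby
# Convert CRLF ("\r\n") and bare CR ("\r") line endings to LF ("\n").
# `normalize_newlines` is a hypothetical helper name.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Made-up sample using bare CR line endings.
cr_text = "bob@example.com\ralice@example.com\rbob@example.com\r"

puts cr_text.split("\n").length                      # whole text seen as one entry
puts normalize_newlines(cr_text).split("\n").length  # one entry per address
```

Bare “\r” endings are the classic Mac OS convention, which is why files saved by some Mac applications trip up scripts that expect “\n”.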

It worked. You are a genius! Thank you so much (it ended up being \r).