Searching for Social Security Numbers in email (Entourage)

OK, here’s the deal. I work at a university and our department has started to scan our users’ computers for Social Security numbers. I wrote AppleScript wrappers for a grep-based command-line scanner and that is working fine.

However, some of our users have Microsoft Entourage (2004) as their primary email client. Since Entourage keeps email in a proprietary format, our normal scanner just cannot detect numbers stored in their messages. (Thanks, Microsoft!)

Soooo, I’m writing an AppleScript that will walk through the user’s Entourage folder tree and scan all the messages. The first major hurdle is that AppleScript can’t (AFAIK) do pattern matching on a string. So I either need some sort of add-in or I can try to use the existing grep-based utility. Any suggestions for the former?

If I go with the latter, I need to know how I can “pipe” large amounts of text data to the terminal window. As much as I would love to simply output the message to a temporary file for scanning, this could inadvertently create an additional copy of a Social Security Number in another file (and we are trying like hell to avoid doing that).
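For what it's worth, here is a rough sketch of the "pipe it to grep" idea, written in Python rather than AppleScript just to show the shape of it. The message text goes to grep on standard input, so nothing is written to disk. The function name `contains_ssn` and the pattern are my own assumptions for illustration (the pattern only matches the hyphenated NNN-NN-NNNN form); it is not the scanner described above.

```python
import subprocess

# Assumed pattern for illustration: NNN-NN-NNNN with hyphens only.
SSN_PATTERN = r"[0-9]{3}-[0-9]{2}-[0-9]{4}"


def contains_ssn(message_text: str) -> bool:
    """Return True if grep finds an SSN-like pattern in message_text.

    The text is sent to grep over stdin, so no temporary file is created.
    """
    result = subprocess.run(
        ["grep", "-E", SSN_PATTERN],
        input=message_text.encode("utf-8"),
        stdout=subprocess.DEVNULL,  # discard matched lines; only the exit code is used
        stderr=subprocess.DEVNULL,
    )
    # grep exits 0 on a match, 1 on no match, and >1 on an error.
    if result.returncode > 1:
        raise RuntimeError("grep reported an error")
    return result.returncode == 0
```

Matched lines are thrown away on purpose, so the number itself never ends up in a log or another file; only a yes/no answer comes back per message.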

Thanks in advance!

PS: If there is interest, I can post my original scanner code in the Code Exchange section.

The Satimage.osax Scripting Addition can do regex.

Hmmm, seems I was wrong about this. While Entourage does use a proprietary format, it only obfuscates the non-user data (to and from, headers, etc.). The message body appears to be stored inline in plain text. So our existing scanner will work.

Thanks, this looks like a nice tool for working with text. :)