help with HTML/empty line cleanup for newbie's Evernote script

hi folks,

** REVISED ** to annotate the script for clarity and clarify my help request

My first script is below. It is mostly working, but it has some bugs I am hoping for advice on. It is meant to parse any Evernote note I have taken in a meeting, looking for any paragraph I have flagged in the text as an “IDEA” I want to capture. It then turns each of those IDEA sentences into a new Evernote note, which inherits the tags and some other body text from the originating Evernote note.

Here are my outstanding problems:

  1. Fixing/parsing HTML entities: When I first wrote the script I had it pull the text from the clipboard (by copying the body of the current note before I ran the script). Now that I’m pulling the text directly, it pulls in the HTML entities, so for example >> ends up as &gt;&gt; in my final note. I checked out a “Fix HTML entities for iTunes” script but couldn’t figure out how to adapt it. I can see from the boards that this is a tough problem and think the easiest way to solve it, which should work 99% of the time, would be to search for any string that begins with an & and ends with a ; and remove it. I have found a snippet that works for this purpose when I test it on its own (see bottom of this note) but I haven’t been able to make it work within the complete script.

  2. Removing an empty line: The script generates an extra note because the cleanText that gets passed to Evernote at the end of the script has an extra return in it. I’ve tried a whole bunch of different approaches to this but no dice.

  3. General syntax cleanup: I’ve been working from a couple of AppleScript books plus the online documentation and many helpful posts here, but my fundamentals are still shaky, so I have a feeling that this script is unnecessarily messy. Any suggestions on what code should be eliminated/streamlined are much appreciated!

Thanks in advance for any help.

Alexandra

global cleanText
-- set all variables to global  [not sure whether this is needed, but it made the script run]
global noteList
global theEverBody
global tempTags
global keepPhrase
global message_text
global newText
global theEverBody
global this_note
global fileText
global this_text
global listOfBadChars


-- get body text of originating Evernote [works fine]
tell application "Evernote"
	set the_notes to the selection
	set this_note to item 1 of the_notes
	set this_text to the ENML content of this_note
	set the clipboard to this_text
	get the clipboard
end tell

-- remove markup -- this works fine but leaves a few html entities as entities, which is a problem
-- remove markup from http://www.macosxautomation.com/applescript/sbrt/sbrt-04.html
set copy_flag to true
set the clean_text to ""
repeat with this_char in this_text
	set this_char to the contents of this_char
	if this_char is "<" then
		set the copy_flag to false
	else if this_char is ">" then
		set the copy_flag to true
	else if the copy_flag is true then
		set the clean_text to the clean_text & this_char as string
	end if
end repeat

set fileText to clean_text

set keepPhrase to "IDEA"


-- this deletes any text paragraphs that don't include the word IDEA
deleteLinesFromText(fileText, keepPhrase)

--deleteLinesFromText adapted from code by hank of hamsoftengineering.com as found on http://macscripter.net/viewtopic.php?id=37830
on deleteLinesFromText(theText, keepPhrase)
	set newText to ""
	try
		-- here's how you can keep all lines of text from fileText that contain the keepPhrase.
		-- first turn the text into a list so you can repeat over each line of text
		set textList to paragraphs of theText
		
		-- now repeat over the list and keep only the lines that contain the keepPhrase
		repeat with i from 1 to count of textList
			set thisLine to item i of textList
			if thisLine contains keepPhrase then
				set newText to newText & thisLine & return
			end if
		end repeat
		if newText is "" then set newText to text 1 thru -2 of newText
	on error
		set newText to theText
	end try
	
	set the message_text to newText
	set delimiter to message_text
	set str to delimiter
	removeIdeas()
	assign_Evernote_variables()
	create_new_Evernotes()
	
end deleteLinesFromText


-- this deletes the word "IDEA" from the remaining paragraphs, so I'm left with the idea but not the IDEA label
on removeIdeas()
	set ideaText to message_text
	set AppleScript's text item delimiters to "IDEA"
	set the item_list to every text item of ideaText
	set AppleScript's text item delimiters to " "
	set this_text to the item_list as string
	set AppleScript's text item delimiters to ""
	set cleanText to this_text
	return cleanText
end removeIdeas


-- this gets the metadata from the originating Evernote note so it can be attached to the new notes I'm creating
on assign_Evernote_variables()
	
	set noteList to paragraphs of cleanText
	tell application "Evernote"
		set the_notes to the selection
		set this_note to item 1 of the_notes
		set tempTitle to the title of this_note
		set tempTags to tags of this_note
		set tempLink to note link of this_note
	end tell
	
	-- this concatenates the original note title with the original note link so they can later be added to the new note as its body
	set theEverBody to tempTitle & return & tempLink
end assign_Evernote_variables

-- this creates a new Evernote note with the idea paragraph as the title for the new note, the tags from the original note as the tags, and the original note title and note link as the body of the new note

on create_new_Evernotes()
	repeat with currentNote in noteList
		set NoteTitle to currentNote
		tell application "Evernote"
			create note with text theEverBody title NoteTitle tags tempTags
		end tell
	end repeat
end create_new_Evernotes

And here is the snippet that worked on its own, but not in the script:



--SCRIPT
local t, x, y, z
set new_message_text to message_text
set t to new_message_text
-- set {x, y, z} to {"<style", "</style>", "<style REPLACED </style>"} -- to replace contents only
set {x, y, z} to {"&", ";", ""} -- to replace both tags and contents
set t to my replaceBlocks({x, y}, z, t)
set new_message_text to t

on replaceBlocks({x, y}, z, t)
	(*
  string x, y : block start tag, block end tag
  string z : replacing string for each block "x..y"
  string t : source text
  return string : replaced string -- [1]
  
  [1] This handler does not support nested blocks.
  (Only the inner most block will be replaced if nested.)
*)
	script o
		property tt : {}
		property uu : {}
		property rr : {}
		property astid : a reference to AppleScript's text item delimiters
		try
			set astid0 to astid's contents
			set astid's contents to {x}
			set tt to t's text items
			set end of my rr to my tt's item 1
			set astid's contents to {y}
			repeat with i from 2 to count my tt
				set uu to my tt's item i's text items
				if (count my uu) = 1 then -- y not found after x in this segment
					set end of my rr to x & my tt's item i
				else
					set end of my rr to z & my uu's rest
				end if
			end repeat
			set astid's contents to astid0
		on error errs number errn
			set astid's contents to astid0
			error "replaceBlocks(): " & errs number errn
		end try
		return my rr's item 1 & my rr's rest
	end script
	tell o to run
end replaceBlocks
--END OF SCRIPT

It’s been quite a while since my last post. What you want to do seems far simpler than the code you have posted. You want

I have always said that if you don’t explain your goal line by line, we can’t give you the code. “some other body text from the originating Evernote note” needs to be clarified.

You need to post the text of a sample Evernote so we can see what you are parsing. If you don’t want to post your ideas that’s fine, just sub “build a rocket from cheese” or something. Just make sure we see what you see as far as formatting. When parsing, your battle is against the formatting.

Every bit of detail you can provide on your goal and what you are looking at will get this done sooner rather than later.

Once I have more details I can get more specific, but it looks like your original parsing code should use AS’s text item delimiters and not if/then routines to parse. You also may end up getting the tags and text from Evernote and finding it simpler to then parse for the key words. Then you would send all to the new Evernote.
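To make the text item delimiters suggestion concrete, here is a minimal sketch that decodes a handful of common entities by find-and-replace instead of walking the text character by character. The handler names replaceText and decodeEntities and the list of entity pairs are illustrative assumptions, not part of the original script; numeric entities or less common named ones would need more pairs or a different method.

```applescript
-- Replace every occurrence of searchString in sourceText with replacementString.
-- (Hypothetical helper; restores the previous delimiters even if an error occurs.)
on replaceText(searchString, replacementString, sourceText)
	set savedDelimiters to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to searchString
		set theItems to text items of sourceText
		set AppleScript's text item delimiters to replacementString
		set resultText to theItems as text
	on error errMsg number errNum
		set AppleScript's text item delimiters to savedDelimiters
		error errMsg number errNum
	end try
	set AppleScript's text item delimiters to savedDelimiters
	return resultText
end replaceText

-- Decode a few common entities (assumed list; extend as needed).
-- "&amp;" is handled last so that "&amp;gt;" becomes "&gt;" rather than ">".
on decodeEntities(sourceText)
	set entityPairs to {{"&gt;", ">"}, {"&lt;", "<"}, {"&quot;", "\""}, {"&apos;", "'"}, {"&nbsp;", " "}, {"&amp;", "&"}}
	repeat with thisPair in entityPairs
		set sourceText to my replaceText(item 1 of thisPair, item 2 of thisPair, sourceText)
	end repeat
	return sourceText
end decodeEntities

decodeEntities("a &gt;&gt; b &amp; c") -- result: "a >> b & c"
```

Unlike stripping everything between & and ;, this keeps the character each entity stands for, so >> survives in the final note.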

SC

Thanks for introducing me to Evernote, I might be using it.

As I suspected, it is far easier to achieve the goal of getting clean text by automating the select all and copy functions.

This gets me the cleaned text without the nightmare of markup removal:

Select a note, then run this script. (“Enable access for assistive devices” must be checked in the “Universal Access” system preferences pane.)

--open your note
activate application "Evernote"

tell application "Evernote"
	set theNote to selection
	open note window with item 1 of theNote
end tell


--copy the text
activate application "Evernote"
tell application "System Events"
	tell process "Evernote"
		keystroke "a" using {command down}
		keystroke "c" using {command down}
		delay 1 --give the copy time to finish; without the delay the clipboard may not be updated yet
		set TextofNote to the clipboard
	end tell
end tell

--close your note
try
	with timeout of 1 second --Evernote had a weird issue with timing out while closing windows. If you don't get the same error you can do away with the try and timeout code.
		tell application "Evernote" to close front window
	end timeout
end try

return TextofNote
--Now your clean text is in the variable "TextofNote"

I assume that with the “clean text” you can parse for the Ideas and transfer the tags, as that part of your code was fine. Why create a handler, though? It’s one line of code with a few variables.
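For the parsing step, one possible sketch follows, assuming each idea sits in its own paragraph and is flagged with the literal word IDEA. The handler name ideasFromText and the sample text are made up for illustration. It builds a list of ideas directly rather than a return-delimited string, so there is no trailing return to turn into an empty extra note.

```applescript
-- Return a list of the paragraphs of sourceText that contain keepPhrase,
-- with keepPhrase itself removed. Empty paragraphs never match, so they are skipped.
-- (Hypothetical helper, not part of the original script.)
on ideasFromText(sourceText, keepPhrase)
	set ideaList to {}
	set theLines to paragraphs of sourceText
	set savedDelimiters to AppleScript's text item delimiters
	repeat with thisLine in theLines
		set thisLine to contents of thisLine
		if thisLine contains keepPhrase then
			-- split on the label and rejoin with nothing to drop the label
			set AppleScript's text item delimiters to keepPhrase
			set lineParts to text items of thisLine
			set AppleScript's text item delimiters to ""
			set end of ideaList to (lineParts as text)
		end if
	end repeat
	set AppleScript's text item delimiters to savedDelimiters
	return ideaList
end ideasFromText

-- Sample text with a blank line and a trailing return, as copied notes often have
set sampleText to "Agenda" & return & "IDEA build a rocket from cheese" & return & "" & return & "Notes" & return
ideasFromText(sampleText, "IDEA") -- result: {" build a rocket from cheese"}
```

Each item of the returned list could then be used as the title in the create note command from the original script. The leading space left behind by the removed label would still need trimming if it matters.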

If you wanted to be true to where you wrote

Then you could insert the above code into a repeat routine so that with one click the script ran through “every note”:

activate application "Evernote"
tell application "System Events"
	tell process "Evernote"
		keystroke "a" using {command down} --select every note
	end tell
end tell

tell application "Evernote" to set theNotes to selection


set Notecounter to 1 --Make an item counter for your notes

repeat with thisNote in theNotes
	tell application "Evernote"
		open note window with item (Notecounter) of theNotes
	end tell
	
	activate application "Evernote"
	
	--copy the text
	activate application "Evernote"
	tell application "System Events"
		tell process "Evernote"
			keystroke "a" using {command down}
			keystroke "c" using {command down}
			delay 1 --give the copy time to finish; without the delay the clipboard may not be updated yet
			set TextofNote to the clipboard
		end tell
	end tell
	
	--close your note
	try
		with timeout of 1 second --Evernote had a weird issue with timing out while closing windows. If you don't get the same error you can do away with the try and timeout code.
			tell application "Evernote" to close front window
		end timeout
	end try
	
	--THIS IS WHERE YOU ADD YOUR CODE TO PARSE FOR IDEAS FROM "TextofNote" AND GET THE TAGS. THEN ADD THE CODE TO CREATE THE NEW NOTE WITH IDEAS AND TAGS
	
	set Notecounter to Notecounter + 1 as number
	
end repeat


If Evernote is open, running the script above will open each note window, copy its text, and close it. It’s not as pretty as running in the background (get ENML content), but much more attractive than parsing HTML, in my opinion.