InDesign CS2, XML: Returning the contents of an XML item.


I’m updating a script that imports an XML file into an InDesign CS2 document, matches tags to styles, and so on, so that I can pick out the content of particular tags and do whatever I want with it (place it in specific text boxes, etc.).

The problem I’m running into is that when I ask for the “contents” property of an XML item (or element), I get back all of the item’s properties rather than the actual text of the tag, even though I can see the text in Script Debugger and the dictionary says that is what the property holds. I can get the text through the characters or lines properties instead, but that’s much slower because I then have to strip out the tag markers sprinkled throughout. Does anyone know a way to get the text from the contents property? This may not be a bug, but it’s not the behavior I’d expect from the dictionary. Thanks!

Here’s the code I’m currently playing around with. It walks down the hierarchy of a predetermined DTD and attempts to return the contents of the “story” element:

global documentObject

tell application "Adobe InDesign CS2"
	set documentObject to object reference of document 1
	set documentObjectElement to associated XML element of documentObject
	set StoryElement to GetXMLElement(documentObjectElement, "story") of me
	if StoryElement is null then return
	set StoryElementContents to contents of StoryElement
	--> Returns the properties of the XML element and not the actual text of the tag
	--This gets me the contents of the tag, but it's slow even on small text.
	--I would prefer to just get the content by getting the property rather than doing this.
	set chrCount to (get count of characters in StoryElement)
	set StoryElementContents to ""
	repeat with c from 1 to chrCount
		set thisChr to character c of StoryElement as string
		--"<FEFF>" is presumably how the invisible U+FEFF tag marker character displays here
		if thisChr is not "<FEFF>" then set StoryElementContents to StoryElementContents & thisChr
	end repeat
	return StoryElementContents
end tell

on GetXMLElement(ParentElement, ChildElementName)
	tell application "Adobe InDesign CS2"
		tell documentObject
			set ParentElementObject to object reference of ParentElement
			tell ParentElementObject
				set ElementCount to (get count of XML Items)
				repeat with x from 1 to ElementCount
					set ChildElement to XML item x
					set ChildElement to object reference of ChildElement
					if class of ChildElement is not DTD then
						set ChildElementMarkupTag to markup tag of ChildElement
						if name of ChildElementMarkupTag is ChildElementName then
							return ChildElement
						end if
					end if
				end repeat
			end tell
		end tell
	end tell
	return null
end GetXMLElement

I’ve dug around on a number of sites, and I’m surprised by the complete lack of information out there (especially from Adobe).


Philip Regan

Just wanted to post that I resolved this. The solution is obvious in hindsight, though it wasn’t at the time. The command…

set StoryElementContents to contents of StoryElement

…needed to be changed to…

set StoryElementContents to contents of contents of StoryElement

…and that returns all of the text. The nested tags come back as Euro currency symbols, but I can manage that on my own.
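
For anyone who wants to drop those placeholder characters in one pass instead of looping over every character, here is a minimal sketch using AppleScript's text item delimiters. It assumes the placeholder arrives as one known character (a Euro symbol in my case; it may differ on your setup). The handler name stripText and its parameters are made up for illustration and are not part of InDesign's dictionary.

```applescript
-- Hypothetical helper: returns sourceText with every occurrence of unwantedText removed.
on stripText(sourceText, unwantedText)
	set savedDelimiters to AppleScript's text item delimiters
	try
		-- Split the text wherever the unwanted character occurs...
		set AppleScript's text item delimiters to unwantedText
		set textParts to text items of sourceText
		-- ...then join the pieces back together with nothing in between.
		set AppleScript's text item delimiters to ""
		set cleanedText to textParts as Unicode text
	on error errorMessage number errorNumber
		-- Restore the delimiters before passing the error along.
		set AppleScript's text item delimiters to savedDelimiters
		error errorMessage number errorNumber
	end try
	set AppleScript's text item delimiters to savedDelimiters
	return cleanedText
end stripText

-- Example call on a plain string; with InDesign you would pass
-- the text returned by "contents of contents of StoryElement" instead.
set cleanedExample to stripText("one€two€three", "€")
```

Restoring the saved delimiters matters because they are global to the script; leaving them changed can quietly break later list-to-text coercions.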