Has anyone run into this? Acrobat returns an empty string as the name of any document whose filename contains diacritics.
tell application "Adobe Acrobat"
set docNames to (name of documents)
end tell
--> {"", "", "aaa.pdf", "bbb.pdf"} instead of {"ééé.pdf", "ààà.pdf", "aaa.pdf", "bbb.pdf"}
tell application "Adobe Acrobat"
set docNames to (file alias of documents)
repeat with i from 1 to length of docNames
set item i of docNames to word -1 of ((item i of docNames) as text)
end repeat
end tell
--> {"ééé.pdf"}
Since Japanese is my native language, I had never considered the possibility of filenames containing characters with diacritical marks, so this was completely new to me.
As a workaround, it seems that the name property of the PDF window can be used reliably.
# OS language: Japanese
# Acrobat language setting: Japanese
tell application "Adobe Acrobat"
set listDoc to (name of every PDF Window) as list
log listDoc as list
--> {"ééé.pdf", "ààà.pdf"}
end tell
Also, there appears to be some variation in return values depending on Acrobat’s language setting.
In my environment, with both the OS and Acrobat set to Japanese, the names returned by AppleScript's document name property and by JavaScript were both unusable.
# OS language: Japanese
# Acrobat language setting: Japanese
tell application "Adobe Acrobat"
set listDoc to (name of every document) as list
log listDoc as list
--> {"e.e.e..pdf", "a.a.a..pdf"}
end tell
tell application "Adobe Acrobat"
set strFileName to (do script "this.documentFileName;") as text
log strFileName
(*....pdf*)
set strFileFullPath to (do script "this.path;") as text
log strFileFullPath
(*/Macintosh HD/Users/USERID/Desktop/a.a.a..pdf*)
#OR
(*/Macintosh HD/Users/USERID/Desktop/e.e.e..pdf*)
end tell
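One possible explanation for the "e.e.e..pdf" pattern, offered as an unverified assumption: the filename may reach Acrobat in decomposed form (Unicode NFD), where "é" is stored as "e" followed by a separate combining accent, and each combining accent is then replaced by a "." because it cannot be displayed. The sketch below is plain JavaScript for Node.js or a browser console, not Acrobat-specific code, and showCombiningMarks is a hypothetical helper name. It only shows that this assumption reproduces the strings seen above.

```javascript
// Hypothetical helper: decompose a name to NFD, then replace every
// combining diacritical mark (U+0300 to U+036F) with a placeholder.
function showCombiningMarks(name, placeholder) {
  return name.normalize("NFD").replace(/[\u0300-\u036f]/g, placeholder);
}

// "\u00E9" is precomposed "é", "\u00E0" is precomposed "à".
console.log(showCombiningMarks("\u00E9\u00E9\u00E9.pdf", ".")); // e.e.e..pdf
console.log(showCombiningMarks("\u00E0\u00E0\u00E0.pdf", ".")); // a.a.a..pdf
```

This does not explain the "....pdf" result from documentFileName, so it is at best part of the picture.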
This is a bug I would never have encountered through my usual workflow, so moving forward, I’m thinking of exploring alternative ways to retrieve filenames and paths.
In the end, it turned out to be an Acrobat issue, but your post was very meaningful to me. Thank you so much!
Just a heads-up for anyone using Acrobat’s JavaScript API, especially the this.addWatermarkFromText() method:
There’s a long-standing bug (going back to Acrobat v10 or earlier) with non-Latin characters, particularly Japanese text. When Unicode characters are passed to addWatermarkFromText, the two bytes of each UTF-16 code unit seem to get swapped internally. That is, the high and low bytes are reversed, so completely wrong characters are rendered in the watermark.
For example:
‘亜’ (U+4E9C) becomes U+9C4E = ‘鱎’
‘あ’ (U+3042) becomes U+4230 = ‘䈰’
‘A’ (U+0041) becomes U+4100 = ‘䄀’
It looks like a UTF-16 byte-order issue, probably big-endian data being read as little-endian or vice versa.
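The three examples above can be reproduced with a few lines of plain JavaScript. This is only a sketch of the suspected behavior, runnable in Node.js or a browser console. It is not taken from Acrobat, and swapUtf16Bytes and toCodeUnitLabel are hypothetical helper names.

```javascript
// Swap the high and low byte of every UTF-16 code unit in a string.
function swapUtf16Bytes(text) {
  var swapped = "";
  for (var i = 0; i < text.length; i++) {
    var unit = text.charCodeAt(i);
    swapped += String.fromCharCode(((unit & 0xff) << 8) | (unit >> 8));
  }
  return swapped;
}

// Format the first code unit of a string as U+XXXX.
function toCodeUnitLabel(text) {
  var hex = text.charCodeAt(0).toString(16).toUpperCase();
  return "U+" + ("0000" + hex).slice(-4);
}

console.log(toCodeUnitLabel(swapUtf16Bytes("\u4E9C"))); // U+9C4E
console.log(toCodeUnitLabel(swapUtf16Bytes("\u3042"))); // U+4230
console.log(toCodeUnitLabel(swapUtf16Bytes("A")));      // U+4100
```

Swapping twice returns the original string, so the same function can be used to check whether a garbled watermark matches this pattern. Whether pre-swapping the input would make addWatermarkFromText render correctly is untested.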
This bug has been around for over a decade and still hasn’t been addressed. If you’re seeing garbled characters in watermark text, especially with Japanese or other non-Latin scripts, this could be the reason.
This isn’t a bug report; I’m just sharing it in case others run into similar issues.
I haven’t verified it yet, but it’s possible that the issue I’m seeing now might also be related to this same byte-swapping bug.