AppleScript Keywords

Anyone know where I can find a decent list of keywords of AppleScript for a syntax highlighter?

EDIT: typo.

It’s hardly worth it since application- and osax-defined keywords get injected all over the place, making a trivial solution unreliable and a reliable solution non-trivial. The standard approach is to compile the script, then ask the AppleScript component for pretty printed source code and display that. There’s various ways to go about that, but you’d need to supply more information as to what it’s for.

But wouldn’t it be great to have an app that, when run on your machine, gave you all of AppleScript’s keywords and the resident OSAX keywords? I agree that there’s no pretty way to do that for your scriptable applications, particularly for separating their unique internal words from the set they simply “understand” that function elsewhere. As a freshman (I now consider myself a late sophomore) I really could have used that list.

It’s for a nopaste site with syntax highlighting a friend is writing (paste.insanityflows.net).

I notice that this forum has something similar with the [code] [/code] tags. I just need a list for the base language so we don’t have to use something like Visual Basic’s syntax highlighting.

Thanks for the replies.

Getting a list of AS-defined keywords should be pretty easy: grab its aeut resource and extract the bulk of them from that, then add in maybe a dozen or so language keywords not in the aeut but which you can find in the AS documentation easily enough (‘if’, ‘repeat’, etc.). Getting a list of osax-defined keywords is even more straightforward: search each of the ScriptingAdditions folders for osaxen and extract all the keywords from their aete resources. You could do it in a few dozen lines of Python code, for example; all the libraries you need for fetching and parsing terminology are already there.

What use you’d have for them is another question. It’s a long list, and won’t really help you in understanding the language. A lot of the keywords AS defines don’t actually do anything; some are just for displaying in dictionaries, others are default terms provided by AS for applications too lazy to define their own. But perhaps I’m missing something? If you’ve a cool use in mind, do tell so I can shamelessly steal and commercialise it for myself. :slight_smile:

Again, extracting lists of application-defined keywords is the easy part. Writing a context-sensitive parser to drive a pretty printer is the hard part; the sort of lightweight regex- or yacc/lex-based scanners that work for other languages with simpler context-free grammars aren’t sufficient for making sense of AS code. The simplest solution is to do what Script Editor et al. do: wait till the source code is compiled, then ask AS for a pretty-printed version and display that. Of course, you’ll need all the applications and osaxen used by the script to be available at the time, otherwise the source code won’t compile at all, but a DIY pretty-printer would need them all too if it’s to work correctly.

HTH

has

Yeah. The styling is neither remotely complete nor accurate though, so just how useful it is in practice is debatable.

Your best solution, IMO, would be to get folk to submit AS source code that’s already marked up. There’s various ways of doing that. It’s pretty trivial to write an AppleScript that opens a script in Script Editor, compiles it, then extracts a list of style runs and formats that with your markup of choice. Search ScriptBuilders - you’ll probably find some existing examples there that you can tailor to your needs.
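The "format the style runs with your markup of choice" step could look something like the sketch below. It assumes the submitter's script has already produced a list of (text, style) pairs; the style names (`keyword`, `appword`) and the `as-` class prefix are made up for illustration and are not anything Script Editor reports.

```python
from html import escape


def runs_to_html(runs):
    """Wrap (text, style) pairs in <span> tags inside a single <pre> block.

    A style of None means the run is emitted as plain (escaped) text.
    """
    parts = []
    for text, style in runs:
        safe = escape(text, quote=False)  # escapes &, < and >
        if style is None:
            parts.append(safe)
        else:
            parts.append('<span class="as-%s">%s</span>' % (style, safe))
    return '<pre class="applescript">' + "".join(parts) + "</pre>"


if __name__ == "__main__":
    # Hypothetical style runs for: if x < 1 then beep
    sample = [
        ("if", "keyword"),
        (" x < 1 ", None),
        ("then", "keyword"),
        (" ", None),
        ("beep", "appword"),
    ]
    print(runs_to_html(sample))
```

The site then only has to supply CSS for the classes; it never needs to understand AppleScript itself.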

HTH

has

I’m too lazy to use python for regex :stuck_out_tongue:

We already got a decent syntax highlighting state machine BSD licensed from another site, so it’s not that big of a deal in terms of adding the language. (Displaying properly may be another story)

There’s probably a cleaner, easier way, but this one works.

Perl script to parse scripting defs and get commands:

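[code]#!/usr/bin/perl
# Usage: perl ASParse.pl file.sdef result.txt
use strict;
use warnings;

# initialize
my ($sdef_path, $out_path) = @ARGV;
die "Usage: perl ASParse.pl file.sdef result.txt\n" unless defined $out_path;

# read file
open(my $defs, '<', $sdef_path) or die "Cannot read $sdef_path: $!";
my @the_defs = <$defs>;
close($defs);

# parse the commands, I dunno what's needed and what's not...
# (only the first matching attribute on each line is captured)
my @commands;
foreach my $line (@the_defs) {
    if ($line =~ m/(?:enumerator name|responds-to name|element type|class name|parameter name|command name)="([^"]*)"/i) {
        push(@commands, $1);
    }
}

# remove duplicates
my %saw;
@saw{@commands} = ();
my @result = sort keys %saw;

# write to file (appends, one term per line)
open(my $out, '>>', $out_path) or die "Cannot write $out_path: $!";
print $out "$_\n" foreach @result;
close($out);[/code]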
enjoy
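For comparison, here is a rough Python sketch of the same idea using an XML parser instead of a per-line regex, so several attributes on one line are not missed. The tag selection simply mirrors the Perl regex above; the function name `sdef_terms` and the sample dictionary are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Elements whose "name" attribute is collected (same set as the Perl regex).
NAME_TAGS = {"command", "class", "parameter", "enumerator", "responds-to"}


def sdef_terms(xml_text):
    """Return the sorted, de-duplicated terms found in sdef XML text."""
    root = ET.fromstring(xml_text)
    terms = set()
    for node in root.iter():
        if node.tag in NAME_TAGS and node.get("name"):
            terms.add(node.get("name"))
        elif node.tag == "element" and node.get("type"):
            # <element> refers to a class through its "type" attribute
            terms.add(node.get("type"))
    return sorted(terms)


# A tiny made-up dictionary, just enough to exercise the function.
SAMPLE = """<dictionary>
  <suite name="Demo Suite" code="demo">
    <command name="make" code="corecrel">
      <parameter name="with properties" code="prdt" type="record"/>
    </command>
    <class name="window" code="cwin">
      <element type="document"/>
      <responds-to name="close"/>
    </class>
    <enumeration name="save options" code="savo">
      <enumerator name="ask" code="ask "/>
    </enumeration>
  </suite>
</dictionary>"""

if __name__ == "__main__":
    for term in sdef_terms(SAMPLE):
        print(term)
```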

Parsing it might not be too awful. Here’s a little sample of the main action (a 195K file) found in
/System/Library/Frameworks/AppleScriptKit.framework/Versions/A/Resources

[code]<?xml version="1.0" encoding="UTF-8"?>

[/code]

Like I say, the problem isn’t getting the terminology, it’s figuring out where to apply it. AS syntax is notoriously difficult to parse correctly. Dealing with context-sensitive grammars is teh suck. :frowning:

Honest, I’ve already looked into the issue myself, trying to figure out practical ways to parse AppleScript source code without busting a gut, and the only way to achieve it that’s both easy to implement and reliable is to get AS itself to do it for you, on a machine that has all the required apps and osaxen installed (i.e. generally the one the script was written on).

If you’re using a regular context-free parser, which I assume is the case, you could try picking a few obvious keywords - if, repeat, then, on, end, etc. - and styling those, strings, numbers and comments, and see how that looks. The keyword styles won’t be 100% accurate, but strings, numbers and comments shouldn’t be any problem.
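A minimal sketch of the "strings and comments" half of that, assuming AppleScript's `--` line comments, nestable `(* *)` block comments, and double-quoted strings with backslash escapes. The function name `scan` and the run labels are made up here, and it deliberately ignores numbers and other corner cases such as `|bar-delimited identifiers|`.

```python
def scan(source):
    """Split source into ('comment' | 'string' | 'code', text) runs."""
    runs = []
    code_start = 0
    i = 0
    n = len(source)

    def flush(end):
        # Emit any plain code collected since the last comment or string.
        if end > code_start:
            runs.append(("code", source[code_start:end]))

    while i < n:
        if source.startswith("--", i):
            flush(i)
            end = source.find("\n", i)
            end = n if end == -1 else end
            runs.append(("comment", source[i:end]))
            i = code_start = end
        elif source.startswith("(*", i):
            flush(i)
            depth, j = 1, i + 2
            while j < n and depth:  # track nesting depth
                if source.startswith("(*", j):
                    depth, j = depth + 1, j + 2
                elif source.startswith("*)", j):
                    depth, j = depth - 1, j + 2
                else:
                    j += 1
            runs.append(("comment", source[i:j]))
            i = code_start = j
        elif source[i] == '"':
            flush(i)
            j = i + 1
            while j < n and source[j] != '"':
                j += 2 if source[j] == "\\" else 1  # skip escaped character
            j = min(j + 1, n)  # include the closing quote if present
            runs.append(("string", source[i:j]))
            i = code_start = j
        else:
            i += 1
    flush(n)
    return runs


if __name__ == "__main__":
    for kind, text in scan('set x to "a -- b" -- note\n(* one (* two *) *) beep'):
        print(kind, repr(text))
```

Because strings are consumed before `--` is looked for, a `--` inside a string literal is not mistaken for a comment. Everything tagged `code` is left for whatever keyword styling you decide to risk.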

Accurately flagging keywords and identifiers yourself will require something much more advanced, however, in which case you should just use AppleScript itself because it already knows how to do it. Also, unless you have some practical way to obtain all the terminology for all the apps and osaxen used in each script (which I seriously doubt), then you’ll have to ask submitters to apply that formatting themselves, since they’ll have those apps and osaxen installed already. Trust me, it’s the easiest solution by far, and ASers won’t mind because they already appreciate the issues involved.

has

I just want to highlight keywords and comments to give a little depth to it. I’m not making anything uber-complex like an AppleScript Script Editor.

Please understand, what you ask for may sound easy to you, but it’s damnably hard to do in practice. C/PHP/Perl/etc. syntax is completely stable; it might look complex but it’s completely predictable and therefore trivial to machine parse. AS syntax is totally unstable; it might look simple but it’s an unmitigated horror to machine parse, even superficially.

As I say, you can’t reliably highlight keywords without using the AppleScript parser itself and having all the required applications and osaxen available at the time. When you do it’s extremely simple; one or two dozen lines of code at most. When you don’t it’s next to impossible. I still strongly recommend you give submitters a simple drag-n-drop AppleScript that will format the code before they send it to you; you’ll get lovely looking results for next to no effort.

However, if you really insist on sticking AS code through a context-free parser, I’d suggest you restrict yourself to styling only control statement, handler and script object declarations, as they’ll be the easiest to pick out. More than that, and it gets next to impossible to correctly detect individual keywords due to all the arbitrary injection and overloading of tokens by apps and osaxen and not knowing which tokens are complete keywords and which are only part of larger multi-word keywords. For example, indiscriminately flag words like ‘on’, ‘to’, ‘as’, etc. and half the time you won’t be highlighting complete language keywords but random parts of application/osax-defined keywords, and your pretty-printed code will end up being harder, not easier, to read because half the styling is completely misleading.

Anyway, my recommended selection of what to hit is below. Doesn’t cover everything, of course, but I reckon it’s the most that you’ll get a decent accuracy rate on. And it should cover all the flow control structures and variable, handler and script object declarations, which is where highlighting will be most helpful anyway. Good luck.

Single-line comment (can appear at any point on a line, ends on linebreak):
--

Multi-line comment (can appear at any point on a line, can be nested):
(*
*)

String literals start and end with a single double-quote:
"

The following backslash escapes are permitted within string literals:
\\ \" \t \n \r

Selected block structure keywords to highlight:

The following phrases should be detected at start of line following zero or more tab indentations:

tell
if
repeat
repeat while
repeat until
repeat with
exit
try
considering
ignoring
with timeout
with transaction
using terms from
return

on
to
script
local
global
property

end
end tell
end if
end repeat
on error
end try
end considering
end ignoring
end timeout
end transaction
end using terms from
end script

The following phrases should be detected at any point in a line:

then
then tell
then if
then repeat
then repeat while
then repeat until
then repeat with
then exit
then try
then considering
then ignoring
then with timeout
then with transaction
then return
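The two phrase lists above could be turned into a pair of regexes along these lines. This is only a sketch: the names `START_RE`, `THEN_RE` and `keyword_spans` are invented, matching is case-sensitive, and it does not skip comments or strings, so it would need to be run only over the plain-code parts of the source.

```python
import re

# Phrases detected at start of line, after zero or more tabs.
START_PHRASES = [
    "tell", "if", "repeat", "repeat while", "repeat until", "repeat with",
    "exit", "try", "considering", "ignoring", "with timeout",
    "with transaction", "using terms from", "return",
    "on", "to", "script", "local", "global", "property",
    "end", "end tell", "end if", "end repeat", "on error", "end try",
    "end considering", "end ignoring", "end timeout", "end transaction",
    "end using terms from", "end script",
]

# Phrases that may follow "then" anywhere in a line.
AFTER_THEN = [
    "tell", "if", "repeat", "repeat while", "repeat until", "repeat with",
    "exit", "try", "considering", "ignoring", "with timeout",
    "with transaction", "return",
]


def alternation(phrases):
    """Regex alternation, longest phrase first so 'end tell' beats 'end'."""
    ordered = sorted(phrases, key=len, reverse=True)
    return "|".join(r"[ \t]+".join(map(re.escape, p.split())) for p in ordered)


START_RE = re.compile(r"^\t*(%s)\b" % alternation(START_PHRASES), re.MULTILINE)
THEN_RE = re.compile(r"\bthen(?:[ \t]+(?:%s))?\b" % alternation(AFTER_THEN))


def keyword_spans(source):
    """Return sorted (start, end, text) tuples for the phrases to highlight."""
    spans = [(m.start(1), m.end(1), m.group(1)) for m in START_RE.finditer(source)]
    spans += [(m.start(), m.end(), m.group()) for m in THEN_RE.finditer(source)]
    return sorted(spans)


if __name__ == "__main__":
    demo = 'tell application "Finder"\n\tif x is 1 then return\nend tell\n'
    for start, end, text in keyword_spans(demo):
        print(start, end, text)
```

Note that a `to` in the middle of a line (as in `repeat with i from 1 to 3`) is left alone, which is the point of anchoring most phrases to the start of the line.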

Thanks man. Yeah I realize how hard it is to parse because of the inconsistency in the language and the fact it’s openly expandable.

I appreciate your help. Using an AppleScript Editor to parse is out of the question as the site is hosted on a Linux server and it’s dynamically generated. Maybe when Intel Macs come out, osacompile or something similar might run and allow syntax highlighting in HTML similar to vim (which would be a genius thing to write, btw).

Again, thanks for your help.

Reikon, are you aware of Jon’s Convert Script to Markup Code?

But you wouldn’t have the application and osaxen terminology available anyway, and that’s where you’re truly stuffed. S’why I recommended you supply users with a script that’ll style code at their end. You could even come up with a nice little Studio-based app that not only styles the code but posts it straight to your site without users having to muck about with web forms and copy-n-paste. They’d like that.

Script Debugger has an ‘export as HTML’ option, but most ASers don’t have SD so this is no use to you.

But like I say, there are already free third-party scripts that will use Script Editor scripting to convert AS source code to styled HTML, and it’s trivial to write your own if none of those quite suit you.

Or you could write your own CLI tool that does this directly without having to muck about scripting Script Editor at all: just ask an NSAppleScript instance to compile the script and return the styled source code as an NSAttributedString, then convert that to HTML markup. Probably a dozen lines of ObjC (plus a couple dozen more of C if you need your styles to be a specific format).

Anyway, all the best however you do it.

has