Parsing JSON files

This is an old post that I’d like to comment on.

The following compound command transforms a JSON string into an equivalent AppleScript value by taking advantage of the remarkable similarity of the AppleScript and JSON value specifications. The primary differences between the two seem to be JSON’s use of double-quoted text strings for record labels (called object keys in JSON) and JSON’s allowance of a single value to span multiple lines. JSON’s use of square brackets to enclose lists (called arrays in JSON) does not require transformation, since square brackets are an acceptable alternative to curly braces for enclosing AppleScript lists. The sed command converts double-quoted object keys to piped record labels (which “automatically” become unpiped if not needed), and the tr command strips any linefeed or carriage return characters:

set applescriptValue to run script (do shell script ("echo " & jsonString's quoted form & " | sed -E 's/\"([^\"]+)\"[[:space:]]*:[[:space:]]*/|\\1|:/g' | tr -d '\\n\\r'"))
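For anyone who wants to see the intermediate text that run script receives, here is a minimal sketch of the same sed and tr pipeline as a plain shell function. The function name json_to_applescript_source and the sample JSON are made up for illustration, and printf stands in for echo only because echo's handling of backslashes varies between shells.

```shell
#!/bin/sh
# Hypothetical helper: the same sed + tr pipeline as the AppleScript one-liner,
# written without the extra layer of AppleScript string escaping.
json_to_applescript_source() {
    printf '%s\n' "$1" |
        sed -E 's/"([^"]+)"[[:space:]]*:[[:space:]]*/|\1|:/g' |
        tr -d '\n\r'
}

# Made-up sample input spanning several lines.
sample_json='{
    "MenuID": 5,
    "MenuName": "Lunch Menu",
    "Tags": ["a", "b"]
}'

converted=$(json_to_applescript_source "$sample_json")
printf '%s\n' "$converted"
```

The printed text is what run script would then compile as AppleScript source. The indentation spaces survive; only the line breaks are removed, and each quoted key becomes a piped label.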

Using an example taken from the web:


set jsonString to "{ 
    \"MenuID\":5, 
    \"MenuVersion\":1, 
    \"MenuName\":\"Lunch Menu\", 
    \"MenuItems\":[ 
       { 
            \"Name\":\"TUSCANI MEDITERRANEAN CON POLLO\", 
            \"Description\":\"Pasta\", 
            \"PKID\":2, 
            \"ParentID\":1, 
            \"Ingredients\":[ 
               { 
                    \"PKID\":123, 
                    \"IngName\":\"Cheese\", 
                    \"Included\":true, 
                    \"ExtraPrice\":0
               }, 
               { 
                    \"PKID\":124, 
                    \"IngName\":\"Sausage\", 
                    \"Included\":false, 
                    \"ExtraPrice\":0.99
               } 
           ], 
            \"ItemPricing\":[ 
               { 
                    \"PKID\":456, 
                    \"SizeName\":\"Large\", 
                    \"SizePrice\":12.99
               }, 
               { 
                    \"PKID\":678, 
                    \"SizeName\":\"Small\", 
                    \"SizePrice\":14.99
               } 
           ]
       } 
   ]
}"

set applescriptValue to run script (do shell script ("echo " & jsonString's quoted form & " | sed -E 's/\"([^\"]+)\"[[:space:]]*:[[:space:]]*/|\\1|:/g' | tr -d '\\n\\r'"))
-->
{MenuID:5, MenuVersion:1, MenuName:"Lunch Menu", MenuItems:{{|name|:"TUSCANI MEDITERRANEAN CON POLLO", |description|:"Pasta", PKID:2, ParentID:1, Ingredients:{{PKID:123, IngName:"Cheese", Included:true, ExtraPrice:0}, {PKID:124, IngName:"Sausage", Included:false, ExtraPrice:0.99}}, ItemPricing:{{PKID:456, SizeName:"Large", SizePrice:12.99}, {PKID:678, SizeName:"Small", SizePrice:14.99}}}}}

It has successfully decoded a number of JSON strings in my testing thus far. Are there any flaws with this straightforward decoding technique? Two come to mind: (1) a JSON text string value that contains a pattern mimicking an object key, i.e., a double-quoted string followed by a colon character, presumably a fairly rare occurrence in real-life usage; and (2) a JSON text string value that contains a linefeed or carriage return character. If one knew that neither of those patterns would be encountered, then I believe the above command would work well. Alternatively, both of those problems could be resolved by tokenizing text string values before running the above command, then detokenizing the result.


If there aren’t, it’s a rare beast. You might care to read this: http://seriot.ch/parsing_json.php

Stuff like parsing JSON and XML is consigned to code libraries for good reason. It can be fun trying to write your own, but all you’re doing is writing bugs waiting to happen.

If there were no alternative, I’d say go for it. But we have several, so IMO this is a really bad idea.

I did. A big problem seems to be the fuzzy nature of the JSON standard and the resulting large number of edge cases that can break a parser. I’ve taken to heart your recommendation to stick with code libraries (excepting perhaps where you know your JSON source well and are confident that it will remain simple and well-behaved.)

I’m still not sure why you would bother. I mean, the combination of run script and do shell script isn’t exactly fast – a simple test here shows it takes > 100 times as long as using NSJSONSerialization.

You are of course right. But I guess I still get a thrill out of one-line solutions…sometimes they’re just plain fun.

Don’t we all :slight_smile:

set theResult to (current application's NSJSONSerialization's JSONObjectWithData:((current application's NSString's stringWithString:jsonString)'s dataUsingEncoding:(current application's NSUTF8StringEncoding)) options:0 |error|:(missing value)) as record


Touché :slight_smile:

P.S. My run script…do shell script… solution takes only 30-40, not >100, times as long as your AppleScriptObjC solution here!! :lol:

I use jq to parse weather information in JSON format from Weather Underground.

https://stedolan.github.io/jq/

For example:

set MoonPhase to do shell script "cat " & DirectoryPad & "/AstroDatabase.json | /usr/local/bin/jq -r .moon_phase.phaseofMoon"

When using JavaScript syntax you can write an entire program in a single line. I think the biggest thrill is using as few resources as possible and the fewest instructions.

function run(argv){
// get current application
var app = Application.currentApplication();
// load specific osax
var StandardAddition = (app.includeStandardAdditions = true, app);
// define string to display
var str = "Hello World!";
// show the dialog
StandardAddition.displayDialog(str);
};

Is the same as:

function run(argv){var app = Application.currentApplication();var StandardAddition = (app.includeStandardAdditions = true, app);var str = "Hello World!";StandardAddition.displayDialog(str);};

Of course, this is quite irrelevant in most situations. JSON data is downloaded in whatever time that takes, and only once. It is then changed and evaluated, and the processing starts.

When you have an output from a shell command for example.

In my opinion it’s about using as few resources as possible and the most efficient code (regardless of the number of instructions or the number of lines).

What really got my juices flowing in this particular problem is the remarkable similarity between JSON and AppleScript value specifications. It seemed almost irresistible simply to do a sed…tr… transform on a well-formed JSON string and run script it into an AppleScript object.

Besides the hit in execution speed (which may or may not be relevant in a given usage scenario), the key term is well-formed. If not, the run script…do shell script… approach might well fail.
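As a minimal sketch of the first flaw listed in the opening post (a string value containing something shaped like an object key), the shell snippet below feeds the sed and tr pipeline a JSON string whose value contains escaped quotes followed by a colon. The helper name and the input are made up for illustration. Note that this input is valid JSON, so the problem is not limited to malformed data.

```shell
#!/bin/sh
# Hypothetical helper: the sed + tr pipeline from the one-liner, as plain shell.
json_to_applescript_source() {
    printf '%s\n' "$1" |
        sed -E 's/"([^"]+)"[[:space:]]*:[[:space:]]*/|\1|:/g' |
        tr -d '\n\r'
}

# Made-up input: the value of "note" contains \"hi\": which looks like a key
# to the regular expression, even though it sits inside a string value.
tricky_json='{"note":"say \"hi\": ok"}'

mangled=$(json_to_applescript_source "$tricky_json")
printf '%s\n' "$mangled"
```

The real key is converted as intended, but the escaped quotes inside the value are also turned into pipes and the space after the colon is swallowed, so the text of the value is no longer what the JSON contained.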

That is the least instructions :slight_smile: Of course with AS you don’t know what your instructions are because they’re bundled in something we know as commands.

Lists in square brackets and lists in curly braces used to be (and I think still are) technically different in AppleScript, but it probably won’t affect the usability of the resulting lists.

It could, of course, all be done in the sed code: :slight_smile:

set applescriptValue to run script (do shell script ("echo " & jsonString's quoted form & " | sed -En 's/\"([^\"]+)\"[[:space:]]*:/|\\1|:/g; H; $ {g; s/[[:cntrl:]]+//g; p;}'"))
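A rough reading of that sed script, offered as a sketch: the first substitution rewrites the keys on each line, H appends every line to the hold space, and on the last line ($) g copies the hold space back into the pattern space, the second substitution deletes runs of control characters (which includes the newlines that H inserted, and any tabs), and p prints the result. The sample input below is made up; the sed command is the one above with the AppleScript escaping removed.

```shell
#!/bin/sh
# Made-up multi-line input, fed to sed one line at a time.
joined=$(printf '%s\n' '{' '  "a" : 1,' '  "b" : [' '    true' '  ]' '}' |
    sed -En 's/"([^"]+)"[[:space:]]*:/|\1|:/g; H; $ {g; s/[[:cntrl:]]+//g; p;}')

# Everything arrives on a single line, with keys converted to piped labels.
printf '%s\n' "$joined"
```

Unlike the sed and tr version, this variant does not consume the whitespace after the colon, so a space between a label and its value is kept.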

Shane Stanley wrote:

Then I’d better have a look at my weather program again; I use “do shell script” with the jq parser quite a lot.

I almost forgot that before the jq parser I used something else to parse JSON information in AppleScript: the program JSON Helper. With this program you can parse JSON into regular AppleScript lists and records.
http://www.mousedown.net/mouseware/JSONHelper.html

I switched to the jq parser because, for me as an AppleScript beginner, this was much easier. I changed some of my jq code to get the same information with JSON Helper.


tell application "JSON Helper"
	set AstroDataBase to fetch JSON from "https://api.wunderground.com/api/My_Key/astronomy/q/NL/Amsterdam.json"
end tell

set AgeOfMoon to ((AgeOfMoon of moon_phase) in AstroDataBase)



This of course is also not what treed needs, but maybe somebody else finds it useful.

I think you’re missing my point. I don’t see how the source of the output makes any difference to the argument that pre-existing libraries are to be preferred over rolling your own code for things like JSON parsing.

It may affect your choice of library – you might prefer to pipe it to python, for example – but that’s a different issue.

Nice sed construct!

A little off topic, but my sense is that sed has one of the higher ratios of [hidden power]/[real-life usage] among tools in a coder’s toolbox. Fully tapping into that power, though, takes a great deal of study and experience.

I must be jaded. I get the biggest thrill from using code I know someone better than me has already debugged.

Hello, two questions:

– Is JSON able to encode Unicode characters?
– Is sed able to operate on Unicode characters?

Yvan KOENIG running Sierra 10.12.6 in French (VALLAURIS, France) Monday, 11 September 2017 16:58:13

Hi Yvan.

It seems so. (See the script below.)

sed on Mac OS can handle Unicode as sequences of bytes, but doesn’t recognise Unicode characters per se. For instance, you can’t use the command y/Д/x/ because sed sees “Д” and “x” as different numbers of characters. But the script below works:

use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

set |⌘| to current application

set unicodeText to "⌘řůД⦿"

set originalRecord to {aString:unicodeText, anArray:{unicodeText}}
set jsonData to |⌘|'s class "NSJSONSerialization"'s dataWithJSONObject:(originalRecord) options:(|⌘|'s NSJSONWritingPrettyPrinted) |error|:(missing value)
set jsonString to (|⌘|'s class "NSString"'s alloc()'s initWithData:(jsonData) encoding:(|⌘|'s NSUTF8StringEncoding)) as text

set reconstitutedRecord to run script (do shell script ("echo " & jsonString's quoted form & " | sed -En 's/\"([^\"]+)\"[[:space:]]*:/|\\1|:/g; H; $ {g; s/[[:cntrl:]]+//g; p;}'"))

{jsonString, reconstitutedRecord}
(* -->
{"{
  \"aString\" : \"⌘řůД⦿\",
  \"anArray\" : [
    \"⌘řůД⦿\"
  ]
}", {aString:"⌘řůД⦿", anArray:{"⌘řůД⦿"}}}
*)

Thanks a lot Nigel.

Yvan KOENIG running Sierra 10.12.6 in French (VALLAURIS, France) Monday, 11 September 2017 19:42:43