Reading data from a GPX file

Hi,
I own a GPS-equipped heart rate monitor and want to extract three fields from each report: time, elevation and heart rate. Here is the GPX file in a zip archive:
suuntoapp-Hiking-2026-01-29T09-57-12Z-track.gpx.zip (41.6 KB)

I have an AppleScript which does what I want using multiple changes to the text item delimiters, but it is slow, taking just over three minutes to run on a 2013 MacBook Pro.

I realise that the GPX format is a lightweight form of XML and suspect that a dedicated XML parser could be used, but I know nothing about them.

Here is my script, but note that when I say “my script”, the handler that does all the hard work was written by forum user StefanK. See post https://www.macscripter.net/t/using-offset/50568/2

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions


set thefile to POSIX path of (choose file with prompt "Please choose a file:" of type {"public.text"})


set fileHandle to open for access thefile
set tFile to read fileHandle
close access fileHandle
set tGPXData to text 1 thru -1 of tFile -- One very long string of <tags> and values with some spaces

######### Parse the Data creating a list of lists  #########
set tid to AppleScript's text item delimiters
set AppleScript's text item delimiters to tab
set {tid, text item delimiters} to {text item delimiters, {"<trkpt"}}
set tDataSet to {}


repeat with i from 2 to (count of text items of tGPXData)
	set tRec to text item i of tGPXData
	--set tTime to ReadRecTime(tRec)
	set tTime to GetSubText(tRec, "<time>", "</time>")
	set tTime to GetSubText(tTime, "T", "Z")
	
	set tElevation to GetSubText(tRec, "<ele>", "</ele>")
	
	set tHeartRate to GetSubText(tRec, "<gpxtpx:hr>", "</gpxtpx:hr>")
	
	
	set DataRec to {PlotTime:tTime, Elevation:tElevation, HeartRate:tHeartRate}
	set end of tDataSet to DataRec
	
	
end repeat

display dialog "Debug stop"



on GetSubText(SomeText, StartItem, enditem)
	copy SomeText to OrigText
	set tid to text item delimiters
	if StartItem is not 1 then
		set text item delimiters to StartItem
		try
			set SomeText to text items 2 thru -1 of SomeText as text
		end try
	end if
	if enditem is not -1 then
		set text item delimiters to enditem
		set SomeText to text item 1 of SomeText
	end if
	set text item delimiters to tid
	if SomeText is OrigText then return ""
	return SomeText
end GetSubText

Any thoughts are welcome, including “don’t do it that way!”
best wishes
Simon

Simon. I’ve included my suggestion below. It uses ASObjC and regular expressions to parse the data and returns a list of lists. The timing result on my M2 Mac mini was 0.6 second.

use framework "Foundation"
use scripting additions

set theFile to POSIX path of (choose file)
set theData to getData(theFile)

on getData(theFile)
	set theString to current application's NSString's stringWithContentsOfFile:theFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	
	set allMatches to current application's NSMutableArray's new()
	set thePattern to "<ele>(.*?)</ele>.*?(\\d{2}:\\d{2}:\\d{2}).*?<gpxtpx:hr>(.*?)</gpxtpx:hr>"
	set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regexResults to theRegex's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	repeat with aMatch in regexResults
		set theMatches to current application's NSMutableArray's new()
		(theMatches's addObject:(theString's substringWithRange:(aMatch's rangeAtIndex:1)))
		(theMatches's addObject:(theString's substringWithRange:(aMatch's rangeAtIndex:2)))
		(theMatches's addObject:(theString's substringWithRange:(aMatch's rangeAtIndex:3)))
		(allMatches's addObject:theMatches)
	end repeat
	
	return allMatches as list
end getData

FWIW, the following version returns records and took 0.63 second to run.

use framework "Foundation"
use scripting additions

set theFile to POSIX path of (choose file)
set theData to getData(theFile)

on getData(theFile)
	set theString to current application's NSString's stringWithContentsOfFile:theFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	
	set allMatches to current application's NSMutableArray's new()
	set thePattern to "<ele>(.*?)</ele>.*?(\\d{2}:\\d{2}:\\d{2}).*?<gpxtpx:hr>(.*?)</gpxtpx:hr>"
	set theRegex to current application's NSRegularExpression's regularExpressionWithPattern:thePattern options:0 |error|:(missing value)
	set regexResults to theRegex's matchesInString:theString options:0 range:{location:0, |length|:theString's |length|()}
	repeat with aMatch in regexResults
		set theDictionary to (current application's NSMutableDictionary's new())
		(theDictionary's setObject:(theString's substringWithRange:(aMatch's rangeAtIndex:1)) forKey:"Elevation")
		(theDictionary's setObject:(theString's substringWithRange:(aMatch's rangeAtIndex:2)) forKey:"TheTime")
		(theDictionary's setObject:(theString's substringWithRange:(aMatch's rangeAtIndex:3)) forKey:"HeartRate")
		(allMatches's addObject:theDictionary)
	end repeat
	
	return allMatches as list
end getData

BTW, these scripts worked with the test file provided. If any of the desired data is missing in future files, these scripts may require error correction.

I cleaned up your script. It is very fast now.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

property tGPXData : missing value
property tDataSet : {}

local cc, psteps, thefile
set thefile to (choose file with prompt "Please choose a file:" of type {"gpx"})
set my tGPXData to read thefile

######### Parse the Data creating a list of lists  #########
set tid to text item delimiters
set text item delimiters to "<trkpt"
set my tDataSet to {}
set my tGPXData to text items of my tGPXData
set cc to count my tGPXData
set progress description to "GPX data reader…"
set progress additional description to ""
set progress total steps to 100
set psteps to cc div progress total steps
repeat with i from 2 to cc
	set tRec to item i of my tGPXData
	set tTime to GetSubText(GetSubText(tRec, "<time>", "</time>"), "T", "Z")
	set tElevation to GetSubText(tRec, "<ele>", "</ele>") as real
	set tHeartRate to GetSubText(tRec, "<gpxtpx:hr>", "</gpxtpx:hr>") as integer
	
	set end of my tDataSet to {PlotTime:tTime, Elevation:tElevation, HeartRate:tHeartRate}
	if (i mod psteps) = 0 then
		set progress completed steps to progress completed steps + 1
		set progress additional description to "(" & i & " of " & cc & ")"
	end if
end repeat

set text item delimiters to tid -- restore the delimiters saved before parsing
display dialog "Debug stop" giving up after 1


on GetSubText(SomeText, StartItem, enditem)
	local tid, OrigText
	copy SomeText to OrigText
	set tid to text item delimiters
	if StartItem is not 1 then
		set text item delimiters to StartItem
		try
			set SomeText to text items 2 thru -1 of SomeText as text
		end try
	end if
	if enditem is not -1 then
		set text item delimiters to enditem
		set SomeText to text item 1 of SomeText
	end if
	set text item delimiters to tid
	if SomeText is OrigText then return ""
	return SomeText
end GetSubText

Your script runs on my 2011 era MBP with Sierra in under 4 seconds using Script Debugger (no Script Geek). With a newer M4 mini it is only a tenth of a second faster, so it is well within the range of how fast you can select the file and change windows.

Using NSDictionary with the location adds a couple more seconds on the MBP, and a couple more tenths of a second on the M4, although I’m not very good with regular expressions. The script takes quite a bit less time to run than starting everything up and clicking some windows.

Edit to add that @robertfern’s script is a little over 5 times faster - the progress indicator doesn’t stand a chance.

My computer is slow. @robertfern’s script took from just over 1 second to about 2.2 seconds on my decrepit late 2013 mac/sierra. However, it does pause for a bit (perhaps as much as 2 seconds) just as the progress indicator is at its last state. This doesn’t end up in the final time. I replaced the choose and dialog lines to reduce needless time issues.

This isn’t as fast but it’s faster than my own attempt. What this does is write the data to a csv file, beginning like this:

Time,Alt,HR
09:57:28,71.0,94
09:57:29,71.0,94

This would allow it to be opened in Excel or Numbers (a tab could be used instead of a comma as the separator). It also took less time to run than my other script, presumably because it doesn’t construct a big list.

use scripting additions
set text item delimiters to {""}
set pdt to (path to desktop as text)
set thef to alias (pdt & "suuntoapp-Hiking-2026-01-29T09-57-12Z-track.gpx")

tell application "System Events"
	set nom to name of thef
	set AppleScript's text item delimiters to "." & name extension of thef
	set base to text item 1 of nom
end tell
set tFile to read thef as «class utf8»

#########  Parse the Data creating a list   #########
set text item delimiters to {"<trkseg>", "</trkseg>"}
set tall to text item 2 of tFile -- prune
set text item delimiters to "</trkpt>"
set timr to text items 1 thru -2 of tall

######### Process list #########
set dataCSV to {"Time,Alt,HR"}
considering case
	repeat with eal in timr
		set text item delimiters to {"<trkpt lat=\"", "\" lon=\"", "<time>", "</time>", "T", "Z", "\"><ele>", "</ele>", "<gpxtpx:hr>", "</gpxtpx:hr>"}
		set tmp to text items of eal
		set {tim, ele, rat} to {item 7 of tmp, item 4 of tmp as real, item 11 of tmp as integer}
		set text item delimiters to ","
		set eachRec to {tim, ele, rat} as text
		set end of dataCSV to eachRec
	end repeat
end considering

-- write to csv file with same name
set text item delimiters to {""}
tell application "System Events"
	set con to path of (container of thef)
end tell
set newn to base & ".csv"
set cfn to con & newn

set text item delimiters to linefeed
set frn to (open for access file cfn with write permission)
try
	set eof frn to 0
	write (dataCSV as text) to frn starting at eof as «class utf8»
end try
close access frn

Thanks for your inputs. I plan to take a close look at them all when I get home from my winter break and attempt to understand how they all work.

best wishes

Simon

I also discovered a weird thing about AppleScript’s built-in “offset” function: it is 2.5 times slower than a custom offset handler I wrote using text item delimiters.

on offset2(findText, SomeText)
	local t, tid
	set tid to text item delimiters
	set text item delimiters to findText
	try
		set t to text items 1 thru 2 of SomeText
	on error
		set text item delimiters to tid -- restore the delimiters before the early return
		return 0
	end try
	set text item delimiters to tid
	return (length of item 1 of t) + 1
end offset2

Simon already has too many suggestions to consider and doesn’t need another one. However, I wanted to test a different regex approach and have included it below just for general information.

As written, the script writes the data in the form of a CSV to the user’s desktop. This took 28 milliseconds on my M2 Mac mini. Alternatively, the CSV string can be made into a list of lists, and this took 152 milliseconds.

The script worked without error on the test file. Error correction may be needed if other test files differ in some significant respect (like missing data).

use framework "Foundation"
use scripting additions

set sourceFile to POSIX path of (choose file)

set theData to getData(sourceFile) --an NSString of the desired data in CSV format
writeString(theData) --disable this if desired
--set listOfLists to getListOfLists(theData) --enable this if desired

on getData(sourceFile)
	set theString to current application's NSString's stringWithContentsOfFile:sourceFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	set theString to theString's stringByReplacingOccurrencesOfString:"<trkpt " withString:linefeed options:1024 range:{0, theString's |length|()} --split string into paragraphs
	set theString to theString's stringByReplacingOccurrencesOfString:"^.*\\n" withString:"" options:1024 range:{0, theString's |length|()} --remove first paragraph
	set thePattern to "(?m)^.*?<ele>(\\d+\\.\\d)</ele>.*?<time>.*(\\d{2}:\\d{2}:\\d{2}).*?</time>.*?<gpxtpx:hr>(\\d+)</gpxtpx:hr>.*?$"
	set theString to theString's stringByReplacingOccurrencesOfString:thePattern withString:"$1,$2,$3" options:1024 range:{0, theString's |length|()} --get data
	return theString
end getData

on writeString(theString)
	set theFile to (current application's NSHomeDirectory()'s stringByAppendingPathComponent:"Desktop")'s stringByAppendingPathComponent:"Data File.txt"
	(current application's NSString's stringWithString:theString)'s writeToFile:theFile atomically:true encoding:(current application's NSUTF8StringEncoding) |error|:(missing value) --existing file will be overwritten
end writeString

on getListOfLists(theData) --may need error correction for blank lines or other oddities in the data
	set theArray to (theData's componentsSeparatedByString:linefeed)
	set arrayOfArrays to current application's NSMutableArray's new()
	repeat with anItem in theArray
		set tempArray to (anItem's componentsSeparatedByString:",")
		(arrayOfArrays's addObject:tempArray)
	end repeat
	return arrayOfArrays as list
end getListOfLists

I’ve made a few changes to my script and got it a little faster without using subroutines.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

property tGPXData : missing value
property tDataSet : {}

local cc, psteps, thefile, tRec, tTime, tElevation, tHeartRate
set thefile to (choose file with prompt "Please choose a file:" of type {"gpx"})
set my tGPXData to read thefile

######### Parse the Data creating a list of lists  #########
set tid to text item delimiters
set text item delimiters to {"<trkpt ", "</trkpt><trkpt "}
set my tGPXData to rest of (text items of my tGPXData)
set cc to count my tGPXData
set progress description to "GPX data reader…"
set progress additional description to ""
set progress total steps to 100
set psteps to cc div progress total steps

repeat with i from 1 to cc
	set tRec to item i of my tGPXData
	set text item delimiters to {"<time>", "</time>"}
	set tTime to text item 2 of tRec
	set text item delimiters to {"T", "Z"}
	set tTime to text item 2 of tTime
	set text item delimiters to {"<ele>", "</ele>"}
	set tElevation to (text item 2 of tRec) as real
	set text item delimiters to {"<gpxtpx:hr>", "</gpxtpx:hr>"}
	set tHeartRate to (text item 2 of tRec) as integer
	set end of my tDataSet to {PlotTime:tTime, Elevation:tElevation, HeartRate:tHeartRate}
	if (i mod psteps) = 0 then
		set progress completed steps to progress completed steps + 1
		set progress additional description to "(" & i & " of " & cc & ")"
	end if
end repeat
set my tGPXData to {}
set text item delimiters to tid
display alert "Debug stop" giving up after 1

A little late to the party, this tries its luck with ObjC XML methods. It returns a list of records slightly more quickly than does @peavine’s second script in post #2, but only because of the way it builds the list of records. peavine’s is faster when modified to build the list in the same way.

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set theFile to (choose file with prompt "Please choose a file:" of type {"gpx"})
parseGPX(theFile)

on parseGPX(GPXFile)
	set GPXURL to current application's class "NSURL"'s fileURLWithPath:(GPXFile's POSIX path)
	set XML to current application's class "NSXMLDocument"'s alloc()'s initWithContentsOfURL:(GPXURL) options:(0) |error|:(missing value)
	set relevantElements to ((XML's rootElement()'s elementsForName:("trk"))'s firstObject()'s ¬
		elementsForName:("trkseg"))'s firstObject()'s elementsForName:("trkpt")
	script o
		property output : {}
	end script
	repeat with thisSet in relevantElements
		set Elevation to (thisSet's elementsForName:("ele"))'s firstObject()'s objectValue() as text
		set PlotTime to ((thisSet's elementsForName:("time"))'s firstObject()'s objectValue() as text)'s text 12 thru 19
		set HeartRate to (thisSet's elementsForName:("extensions"))'s firstObject()'s objectValue() as text
		set o's output's end to {PlotTime:PlotTime, Elevation:Elevation, HeartRate:HeartRate}
	end repeat
	
	return o's output
end parseGPX

FWIW, I tested all of the above scripts that return a list of records. I also tested a new script that I wrote. The testing was done with Script Geek on my M2 Mac mini. The ASObjC scripts were tested with the Foundation framework in memory, which would normally be the case. I did a quick spot check, and the scripts all appeared to return the same data. The results were:

Simon - 7.052 seconds
Peavine (old) - 607 milliseconds
Nigel - 505 milliseconds
Peavine (new) - 223 milliseconds
Robert (newest version) - 141 milliseconds

My new script:

use framework "Foundation"
use scripting additions

set sourceFile to POSIX path of (choose file)

set theData to getData(sourceFile) --a list of records

on getData(sourceFile)
	set theString to current application's NSString's stringWithContentsOfFile:sourceFile encoding:(current application's NSUTF8StringEncoding) |error|:(missing value)
	set theString to theString's stringByReplacingOccurrencesOfString:"<trkpt " withString:linefeed options:1024 range:{0, theString's |length|()} --split string into paragraphs
	set theString to theString's stringByReplacingOccurrencesOfString:"^.*\\n" withString:"" options:1024 range:{0, theString's |length|()} --remove first paragraph
	set thePattern to "(?m)^.*?<ele>(\\d+\\.\\d)</ele>.*?<time>.*(\\d{2}:\\d{2}:\\d{2}).*?</time>.*?<gpxtpx:hr>(\\d+)</gpxtpx:hr>.*?$"
	set theString to theString's stringByReplacingOccurrencesOfString:thePattern withString:"$1,$2,$3" options:1024 range:{0, theString's |length|()} --a CSV
	set theArray to (theString's componentsSeparatedByString:linefeed)
	
	set theData to current application's NSMutableArray's new()
	set theKeys to current application's NSArray's arrayWithArray:{"Elevation", "PlotTime", "HeartRate"}
	repeat with anItem in theArray
		set keyValues to (anItem's componentsSeparatedByString:",")
		set theDictionary to (current application's NSDictionary's dictionaryWithObjects:keyValues forKeys:theKeys)
		(theData's addObject:theDictionary)
	end repeat
	
	return theData as list
end getData

BTW, given the nature of the data, I’m curious if a list of records is really needed and if a list of lists (which is probably simpler and faster) would better do the job. Obviously, that’s for Simon to decide.

Wow! The speed increases are impressive. The use of the NS variables etc is beyond me at the moment as are complex regex patterns but I’ll work my way through all the examples as a learning exercise.

“I’m curious if a list of records is really needed and if a list of lists (which is probably simpler and faster) would better do the job. Obviously, that’s for Simon to decide.”

Given that each inner list would be three items, a list of lists would be OK. I chose to use records to aid clarity.

My aim is to extend the script to draw a graph using OmniGraffle which replicates a graph that my Suunto HRM app draws but refuses to export. I have a similar script that plots an X Y graph and this new one will plot two lines one for heart rate and the other for altitude. The icing on the cake is to add line colour according to heart rate zone.

Thanks again for all your work and scripts.

Simon

I have another proposal, based not on AppleScript but on the Unix tools available in macOS, specifically the AWK language.

Analysing the structure of the input file, I noticed that it consists of segments like the ones below:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:xsi="http://www.w3.org/2001/XMLSchema
<metadata><name>suuntoapp-Hiking-2026-01-29T09-57-12Z</name><desc/><author><name>Mick Mouse</name></author></metadata>
<trk>
  <name>suuntoapp-Hiking-2026-01-29T09-57-12Z</name>
  <trkseg>
    <trkpt lat="54.462942" lon="-3.022662">
      <ele>71.0</ele>
      <time>2026-01-29T09:57:28Z</time>
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:hr>94</gpxtpx:hr>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>

    <trkpt lat="54.462882" lon="-3.022667"><ele>70.8</ele><time>2026-01-29T09:57:31Z</time><extensions><gpxtpx:TrackPointExtension><gpxtpx:hr>93</gpxtpx:hr></gpx
  ...
  </trkseg>
</trk>

Because the track data is all on one line, we can split it at each >< sequence to create lines holding one field each, which gives something like this:

trkpt lat="54.462835" lon="-3.022667"
ele>70.8</ele
time>2026-01-29T09:57:35Z</time
extensions
gpxtpx:TrackPointExtension
gpxtpx:hr>96</gpxtpx:hr
/gpxtpx:TrackPointExtension
/extensions
/trkpt

We can then use a field separator such as <|>. In AWK this is a regular expression, so it means < OR >. With it, the fields of the line that begins with time are:

  1. time
  2. 2026-01-29T09:57:35Z
  3. /time

and we are interested only in the 2nd field, which AWK calls $2.
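To see the splitting in action, here is a minimal sketch that feeds one track point (taken from the sample above and flattened to one line) through the same record and field separators and numbers the fields of the time record. It assumes an awk that accepts a multi-character RS, as the program in this post does. The variable names sample and fields are only for illustration.

```shell
# One track point, flattened to a single line as in the real file.
sample='<trkpt lat="54.462835" lon="-3.022667"><ele>70.8</ele><time>2026-01-29T09:57:35Z</time><extensions><gpxtpx:TrackPointExtension><gpxtpx:hr>96</gpxtpx:hr></gpxtpx:TrackPointExtension></extensions></trkpt>'

# Split into records at "><", split each record into fields at "<" or ">",
# and list the numbered fields of the record that starts with "time".
fields=$(printf '%s' "$sample" | awk 'BEGIN { RS="><"; FS="<|>" }
/^time/ { for (i = 1; i <= NF; i++) print i ". " $i }')

printf '%s\n' "$fields"
```

This should list the same three fields as above, with the timestamp as field 2.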

The proposed solution:

  • split the input wherever >< occurs (removing those characters)
  • keep only the lines that start with the markers for elevation, time and heart rate; each filtering regular expression starts with ^, which means “line begins with …”
  • save the fields from the first two lines in variables (so the output order can be changed later)
  • print the result when the last field in the sequence (gpxtpx:hr) arrives
  • silently ignore all other lines

The AWK program (gpxfile.awk):

BEGIN { RS="><"; FS="<|>" }

/^ele/       { elevation = $2 }
/^time/      { time =  $2 }
/^gpxtpx:hr/ { print time, elevation, $2 }

and it is run from the command line like this (redirecting the results to an output file):

awk -f gpxfile.awk input-file > output-file

Total time measured with the time tool on a MacBook Air M1 is 72 ms:

0,06s user 0,00s system 83% cpu 0,072 total

Other AWK functions make it possible to reformat the output or to extract only part of the time (or present it in another form).
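As a rough illustration of that kind of reformatting, the sketch below keeps only the HH:MM:SS part of the timestamp with substr and prints comma-separated lines under a header. The two track points are shortened copies of the sample data. The header text and the names sample, csv and clock are my own choices, not part of the original program.

```shell
# Two track points on one line, as in the real file.
sample='<trkpt lat="54.462942" lon="-3.022662"><ele>71.0</ele><time>2026-01-29T09:57:28Z</time><extensions><gpxtpx:TrackPointExtension><gpxtpx:hr>94</gpxtpx:hr></gpxtpx:TrackPointExtension></extensions></trkpt><trkpt lat="54.462882" lon="-3.022667"><ele>70.8</ele><time>2026-01-29T09:57:31Z</time><extensions><gpxtpx:TrackPointExtension><gpxtpx:hr>93</gpxtpx:hr></gpxtpx:TrackPointExtension></extensions></trkpt>'

csv=$(printf '%s' "$sample" | awk 'BEGIN { RS="><"; FS="<|>"; OFS=","; print "Time,Alt,HR" }
/^ele/       { elevation = $2 }
/^time/      { clock = substr($2, 12, 8) }   # characters 12 to 19 hold HH:MM:SS
/^gpxtpx:hr/ { print clock, elevation, $2 }')

printf '%s\n' "$csv"
```

Setting OFS changes the separator that print puts between its arguments, so switching to tabs only needs OFS="\t".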

Of course, this is a different approach, and someone with AppleScript experience may prefer a solution based on it.

For integration with the GUI, it can be put into Shortcuts as a Run Shell Script action, preceded by a step that selects the file.
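A sketch of what the body of such a Run Shell Script step might look like is below. It assumes the action is configured to pass the selected file to the script as an argument, so the path arrives as "$1". The helper name gpx_to_txt and the choice to write a .txt file next to the GPX file are hypothetical, for illustration only.

```shell
# Hypothetical helper: run the AWK program on one GPX file and
# write the results to a .txt file with the same base name.
gpx_to_txt() {
  in=$1
  out="${in%.gpx}.txt"
  awk 'BEGIN { RS="><"; FS="<|>" }
  /^ele/       { elevation = $2 }
  /^time/      { time = $2 }
  /^gpxtpx:hr/ { print time, elevation, $2 }' "$in" > "$out"
  printf '%s\n' "$out"   # report where the results were written
}

# Assumption: the chosen file is passed in as the first argument.
if [ -n "${1:-}" ]; then
  gpx_to_txt "$1"
fi
```

Printing the output path means a later step in the shortcut could pick the file up, for example to open it.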

Exploring ObjC’s XML possibilities a little further for my own interest …

A few edits later: This latest version’s very fast indeed. 🙂

use AppleScript version "2.4" -- OS X 10.10 (Yosemite) or later
use framework "Foundation"
use scripting additions

set theFile to (choose file with prompt "Please choose a file:" of type {"gpx"})
parseGPX(theFile)

on parseGPX(GPXFile)
	set GPXURL to current application's class "NSURL"'s fileURLWithPath:(GPXFile's POSIX path)
	set XML to current application's class "NSXMLDocument"'s alloc()'s initWithContentsOfURL:(GPXURL) options:(0) |error|:(missing value)
	script o
		property eleValues : (XML's objectsForXQuery:("//ele/string()") |error|:(missing value)) as list
		property timeValues : (XML's objectsForXQuery:("//time/substring(string(), 12, 8)") |error|:(missing value)) as list
		property hrValues : (XML's objectsForXQuery:("//gpxtpx:hr/string()") |error|:(missing value)) as list
		property output : eleValues -- Reuse an existing list for the output.
	end script
	repeat with i from 1 to (count o's eleValues)
		set o's output's item i to ¬
			{PlotTime:o's timeValues's item i, Elevation:o's eleValues's item i, HeartRate:o's hrValues's item i}
	end repeat
	
	return o's output
end parseGPX

I finally have some time to play with your AWK example, which was new to me. I have made a few changes to your AWK file so that the output is tab-separated.

Here is my commented version:

BEGIN { RS="><"; FS="<|>|T|Z"; OFS = "\t" }
# RS or record separator defines the characters that split the data into records
# FS or field separator sets the characters that split a record into fields
# OFS output field separator

# AWK generally runs on lines of the form: pattern {action}
# $n refers to the nth field, so $2 is the second field of a record

# patterns are written between "/" characters
# it's a little like a spreadsheet IF statement with no else
# The part before the "{" character may also be an expression.

# the "^" character means "at the beginning of the record" - see RegEx.
# in this example the data is always in the second or third field.

/^ele/       { elevation = $2 }
/^time/      { date =  $2 }
/^time/      { time =  $3 }
/^gpxtpx:hr/ { print date, time, elevation, $2 }
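
As a quick check of this version, the sketch below runs the same settings on a single track point from the sample data. The two time rules are merged into one action here, which should behave the same way. The names sample and out are only for illustration, and a multi-character RS is assumed to be supported, as above.

```shell
# One track point on a single line, as in the real file.
sample='<trkpt lat="54.462942" lon="-3.022662"><ele>71.0</ele><time>2026-01-29T09:57:28Z</time><extensions><gpxtpx:TrackPointExtension><gpxtpx:hr>94</gpxtpx:hr></gpxtpx:TrackPointExtension></extensions></trkpt>'

# Adding T and Z to FS splits the timestamp into a date ($2) and a time ($3).
out=$(printf '%s' "$sample" | awk 'BEGIN { RS="><"; FS="<|>|T|Z"; OFS = "\t" }
/^ele/       { elevation = $2 }
/^time/      { date = $2; time = $3 }
/^gpxtpx:hr/ { print date, time, elevation, $2 }')

printf '%s\n' "$out"
```

Records such as gpxtpx:TrackPointExtension also contain a capital T and so get split as well, but none of the patterns match them, so that is harmless here.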

Simon

Yes it is very quick - I just wish I understood how it does it.