Parsing hideous 'schedule expressions' in tabbed text files

Our school uses PowerSchool for student record managment. I need a way to parse some of the data that we export from the system so that I can import it into a different database.

When you export a list of classes, you get a tab-delimited text file something like this:

Course Number Section Number Teacher Room Expression
SCI107 2 2 225 3(A-C,E)
SCI107 4 2 225 4(A-B,D-E)
SCI107 1 2 226 3(A-C,E)
STU107 1 2 227 5(A-C,E)

For example, SCI107 section 2 meets in Room 225 and is taught by Teacher #3.
The ‘Expression’ is the gnarly part - that’s the info about what period & day of the class meetings.
In the first line under Expression, “3(A-C,E)” is parsed as follows:
3 = Period 3 which we call ‘C Block’
A-C,E = meets Mon, Tue, Wed, Fri (A = Monday, B = Tuesday, etc).

What I need to end up with is a matrix (.csv or tabbed text) for each room with free/busy times indicated in each cell.(blank if free, Course Number + Section Number if busy)

Help me Obi-Wan Kenobi, you’re my only hope!

Not too tricky using text item delimiters and parsing each paragraph at the tabs, but I think I need to know how many “blocks” there are in a day or I can’t determine the blanks. I’m heading for an answer that’s a string with lines delimited by returns and items delimited by tabs.

The script below illustrates some of the concepts for getting there:

-- items are all tab delimited
set tbl to paragraphs of "SCI107	2	2	225	3(A-C,E)
SCI107	4	2	225	4(A-B,D-E)
SCI107	1	2	226	3(A-C,E)
STU107	1	2	227	5(A-C,E)"
set newTable to "" -- blank string to build the answer in.
-- Extract the pieces from each line based on tab spacing
set tid to AppleScript's text item delimiters
repeat with oneP in tbl -- consider one line at a time
	set AppleScript's text item delimiters to tab
	set tParts to text items of oneP
	set tCourse to item 1 of tParts
	set tSection to item 2 of tParts
	set tRoom to item 4 of tParts
	set tBlock to text 1 of item 5 of tParts
	set tDays to text 3 thru -2 of item 5 of tParts -- leave out the parens.
	set wDays to getWeekDays(tDays)
	set newTable to newTable & tCourse & tab & tSection & tab & tRoom & tab & tBlock & tab & wDays & return
end repeat
set AppleScript's text item delimiters to tid

display dialog newTable

-- One of many possible ways to get the abbreviated days from the data stream
to getWeekDays(dayCode)
	set wkDays to {"Mon", "Tue", "Wed", "Thu", "Fri"}
	set tCodeDays to "ABCDE"
	set allDays to {}
	set tid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set tParts to text items of dayCode
	repeat with onePart in tParts
		if onePart contains "-" then
			set tstart to offset of (text 1 of onePart) in tCodeDays
			set tend to offset of (text 3 of onePart) in tCodeDays
			set span to ""
			repeat with k from tstart to tend
				set span to span & text k of wkDays & ", "
			end repeat
			set span to text 1 thru -3 of span -- drop the last ", "
		else
			set span to item (offset of onePart in tCodeDays) of wkDays
		end if
	end repeat
	set AppleScript's text item delimiters to tid
	return span
end getWeekDays

Here’s my take. :slight_smile: It returns a list containing individual “matrices” for each room, but it’s an easy matter to combined them into a single text if wished.

property blocksPerDay : 5 -- Adjust as required.

-- Given tabbed text in PowerSchool output format, return a list of "matrices" for the rooms involved. Each room matrix is a single text in which each column is a day from Monday to Friday and each row is a "block" in the day. Each "cell" contains a course and section number or is blank. Columns are tab delimited; rows are return delimited.
on rematrix(PowerSchoolOut)
	set matrixTemplate to makeTemplate(blocksPerDay) -- List template for "room" matrix.
	set theMatrices to {} -- Container for returned room matrices.
	set roomList to {} -- A note of already-started room matrices, for convenience.
	
	set theLines to PowerSchoolOut's paragraphs 2 thru -1 -- All the lines except the headers.
	
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to tab
	repeat with thisLine in theLines
		set {courseNo, sectionNo, teacherNo, roomNo, expression} to thisLine's text items
		-- Construct the "course number and section number" text.
		set courseAndSection to courseNo & space & sectionNo
		-- If a matrix has already been started for this room, use it. Otherwise start one.
		if (roomNo is in roomList) then
			repeat with i from 1 to (count theMatrices)
				set roomMatrix to item i of theMatrices
				if (roomMatrix ends with {roomNo}) then exit repeat
			end repeat
		else
			copy matrixTemplate to roomMatrix
			set end of theMatrices to roomMatrix
			set end of roomMatrix to roomNo
			set end of roomList to roomNo
		end if
		-- Derive the block number and day numbers from the "expression" entry and insert the "course number and section number" text into the appropriate slots in the room matrix.
		set {blockNo, dayNos} to parseExpression(expression)
		repeat with dayNo in dayNos
			set item dayNo of item blockNo of roomMatrix to courseAndSection
		end repeat
	end repeat
	
	-- When all the matrix lists have been created and filled, sort them by room number (the last item in each list).
	script byLastItem
		on isGreater(a, b)
			(end of a) > (end of b)
		end isGreater
	end script
	CustomInsertionSort(theMatrices, 1, -1, {comparer:byLastItem})
	
	-- Convert each matrix from a list to a single text, adding day-of-week column names and block-letter row names.
	set horizontalHeaders to "Block" & tab & "Monday" & tab & "Tuesday" & tab & "Wednesday" & tab & "Thursday" & tab & "Friday	" & return
	repeat with thisRoom in theMatrices
		set AppleScript's text item delimiters to tab
		repeat with i from 1 to (count thisRoom) - 1
			set thisBlock to item i of thisRoom
			set item i of thisRoom to character id (i + 64) & tab & thisBlock
		end repeat
		set AppleScript's text item delimiters to return
		set thisRoom's contents to "Room " & end of thisRoom & ":" & return & horizontalHeaders & items 1 thru -2 of thisRoom
	end repeat
	
	set AppleScript's text item delimiters to astid
	
	return theMatrices
end rematrix

-- Create a matrix template in the form of a list of lists.
on makeTemplate(blocksPerDay)
	set template to {}
	set row to {}
	repeat 5 times -- 5 days per row
		set end of row to ""
	end repeat
	repeat blocksPerDay times -- blocksPerDay rows per matrix.
		copy row to end of template
	end repeat
	
	return template
end makeTemplate

-- Derive the block number and day number from the "expression" text.
-- "3(A-C,E)" means "block 3 on days 1, 2, 3, and 5".
on parseExpression(expression)
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "("
	set blockNo to (text item 1 of expression) as integer
	set dayExpression to text 1 thru -2 of text item 2 of expression
	set AppleScript's text item delimiters to ","
	set dayExpression to dayExpression's text items
	set AppleScript's text item delimiters to astid
	
	set dayNos to {}
	repeat with thisRun in dayExpression
		repeat with i from (id of character 1 of thisRun) to (id of character -1 of thisRun)
			set end of dayNos to i mod 8
		end repeat
	end repeat
	
	return {blockNo, dayNos}
end parseExpression

-- Customisable insertion sort.
on CustomInsertionSort(theList, l, r, customiser)
	script o
		property comparer : me
		property slave : me
		property lst : theList
		
		on isrt(l, r)
			set u to item l of o's lst
			repeat with j from (l + 1) to r
				set v to item j of o's lst
				if (comparer's isGreater(u, v)) then
					set here to l
					set item j of o's lst to u
					repeat with i from (j - 2) to l by -1
						tell item i of o's lst
							if (comparer's isGreater(it, v)) then
								set item (i + 1) of o's lst to it
							else
								set here to i + 1
								exit repeat
							end if
						end tell
					end repeat
					set item here of o's lst to v
					slave's shift(here, j)
				else
					set u to v
				end if
			end repeat
		end isrt
		
		on isGreater(a, b)
			(a > b)
		end isGreater
		
		on shift(a, b)
		end shift
	end script
	
	set listLen to (count theList)
	if (listLen > 1) then
		if (l < 0) then set l to listLen + l + 1
		if (r < 0) then set r to listLen + r + 1
		if (l > r) then set {l, r} to {r, l}
		
		if (customiser's class is record) then set {comparer:o's comparer, slave:o's slave} to (customiser & {comparer:o, slave:o})
		
		o's isrt(l, r)
	end if
	
	return -- nothing.
end CustomInsertionSort


set PowerSchoolOut to "Course Number	Section Number	Teacher	Room	Expression
SCI107	2	2	225	3(A-C,E)
SCI107	4	2	225	4(A-B,D-E)
SCI107	1	2	226	3(A-C,E)
STU107	1	2	227	5(A-C,E)"

rematrix(PowerSchoolOut)

Neat. I didn’t think of “set {courseNo, sectionNo, teacherNo, roomNo, expression} to thisLine’s text items” which is nicely compact and efficient.