But this can be up to hundreds of paragraphs/lines.
What I would like to do is filter this whole structure and extract only the texts that belong under two specific titles, “Title_B” and “Title_D”.
So, from the above, the result should be a single unified text containing only the text under those two titles.
All the titles in the structure are the same and always in bold, but the sizes of the texts underneath vary.
So is there any way to filter this structure?
For the moment I am trying to do that with the following:
on offsets(needle, haystack)
	script go
		on |λ|(hay, iFrom)
			set i to offset of needle in hay
			if 0 ≠ i then
				set iMatch to iFrom + i
				if i < length of hay then
					{iMatch} & |λ|(text (1 + i) thru -1 of hay, iMatch)
				else
					{iMatch}
				end if
			else
				{}
			end if
		end |λ|
	end script
	go's |λ|(haystack, 0)
end offsets
epaminos. I don’t understand your request to “filter this whole structure.” That issue aside, the following script should do what you want.
set theText to paragraphs of (read (choose file))
main(theText)

on main(txt)
	script o
		property txtRef : txt
		property extractedText : {}
	end script
	repeat with i from 1 to ((count o's txtRef) - 1)
		set aLine to item i of o's txtRef
		if aLine begins with "Title_B" or aLine begins with "Title_D" then
			set end of o's extractedText to (removeWhitespace(item (i + 1) of o's txtRef))
		end if
	end repeat
	set ATID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set extractedText to o's extractedText as text
	set AppleScript's text item delimiters to ATID
	return extractedText
end main
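
on removeWhitespace(aLine)
	-- Strip leading tabs and spaces from the line.
	repeat ((count aLine) - 1) times
		if text 1 of aLine is in {tab, space} then
			set aLine to text 2 thru -1 of aLine
		else
			return aLine
		end if
	end repeat
	-- Also return when the loop runs out (empty line, or only one character left);
	-- without this the handler returns nothing in those cases.
	return aLine
end removeWhitespace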
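
The script above keeps only the single line that follows each matching title. Since the question says the amount of text under each title varies, here is a minimal sketch of a variation that keeps every non-empty line under a wanted title until the next title appears. It rests on an assumption that is not confirmed in the thread: that every title line begins with a common prefix such as "Title_" (bold formatting is not visible in text read from a plain file). The handler name extractSections and its parameters are hypothetical.

```applescript
-- Sketch only. Assumes every title line begins with titlePrefix (e.g. "Title_").
on extractSections(theLines, wantedTitles, titlePrefix)
	set keptLines to {}
	set keeping to false
	repeat with lineRef in theLines
		set lineText to contents of lineRef
		if lineText begins with titlePrefix then
			-- A new title starts: keep what follows only if it is a wanted title.
			set keeping to false
			repeat with titleRef in wantedTitles
				if lineText begins with (contents of titleRef) then
					set keeping to true
					exit repeat
				end if
			end repeat
		else if keeping and lineText is not "" then
			set end of keptLines to lineText
		end if
	end repeat
	-- Join the kept lines with linefeeds, restoring the delimiters afterwards.
	set ATID to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set joinedText to keptLines as text
	set AppleScript's text item delimiters to ATID
	return joinedText
end extractSections
```

It could be called as extractSections(paragraphs of (read (choose file)), {"Title_B", "Title_D"}, "Title_"). Leading whitespace is left untouched here; each kept line could be passed through removeWhitespace if that is wanted.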