All,
The script takes a plain text file as input. Each file begins with a citationKey for a book. I then write all my notes on that book in this one document: every note is preceded by a page notation (e.g. @88) and, except for the last note, ends with a unique delimiter: $%$
The script reads the file and splits it into chunks. For each chunk it appends the page number to the citationKey, combines the text and the citationKey, and writes the result to a separate file, named according to a reworked form of the citationKey. This ensures that notes are always hard-linked to their source.
set theFile to (choose file with prompt "Select a file with notes to split into separate files:")
open for access theFile
set someText to (read theFile)
close access theFile
-- Example citationKey: {Wessel, 2009, #82957}
-- Example page number: @88
-- Delimiter used in original file: $%$
on explode(delimiter, someText)
    set prevTIDs to AppleScript's text item delimiters
    set AppleScript's text item delimiters to delimiter -- This is the delimiter string that will split the text into chunks
    set output to text items of someText -- After splitting, the various chunks of text have now become individual 'items'
    set AppleScript's text item delimiters to prevTIDs -- Reset the delimiter to default value
    set itemCount to count of items of output -- itemCount stores the number of items (chunks)
    set refNo to item 1 of output -- item 1 is always the citationKey copied from the Reference manager software
    set refNo to characters 1 through ((length of refNo) - 1) of refNo as string -- Remove the closing curly brace from the citationKey
    set refItem to last word of refNo -- Captures the clean number from the citationKey - From {Wessel, 2009, #82957} to 82957
    set i to 1 -- i is set to control the repeat loop. We start at i = 1 instead of i = 0 to skip Item 1, which is the citationKey
    try
        repeat until i is itemCount
            set i to i + 1 -- increment by one, until i becomes equal to itemCount, which halts the process
            set targetFolder to "MiniHD:Users:ImNotApc:Documents:Bookshelf in Finder:"
            set mainBody to item i of output -- Contents of each item are stored in local variable
            set mainBodyLean to paragraphs 3 thru -1 of mainBody as string -- Due to the delimiter, some empty lines were added, these are here removed
            set fileNameQuote to (characters 1 through 20 of mainBodyLean) as string -- Extract first 20 characters of the text to use in naming the file
            set pageNo to second paragraph of mainBody -- The second paragraph contains the page number
            set fullCite to refNo & pageNo & "}" -- Complete the citationKey, to include the page number
            set fileNamer to "{#" & refItem & pageNo & "} " & fileNameQuote & "..." -- Complete the file name, to include full citationKey plus 20 characters plus ...
            set writeIt to mainBodyLean & " " & fullCite -- Compose the content of each separate file, and include the full citationKey *after* the text it refers to
            set fileSpec to (open for access file (targetFolder & fileNamer as text) with write permission) -- Routine to create a new file in the targetFolder in the Finder
            write writeIt to fileSpec as text -- Writing the contents to the newly created file
            close access fileSpec
        end repeat
        return "Succeeded in Writing to Separate Files in:" & targetFolder -- Confirmation that process completed in writing the files
    on error
        try
            close access fileSpec
        end try
        return "Failed Writing to File."
    end try
end explode
explode("$%$", someText)
Issues:
- The first file the script writes triggers a strange UTF-8 error message when I try to open it from the Finder. TextEdit does not open it, but Tex-Edit does, and the correct chunk of text is all there. All subsequent files are correct plain text files. Does anyone know of a solution to this?
- If a note (a chunk of text) consists of two or more paragraphs, these are compacted into one. During the process, the paragraph returns (or line endings?) are removed. Does that happen automatically because the System reads the file, rather than TextEdit? Is there a way to prevent this from happening?
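A possible direction on both issues, offered as a sketch and not as a tested fix for the script above. On the second issue, the likely cause is the coercion "paragraphs 3 thru -1 of mainBody as string": it builds a list of paragraphs and then joins that list with the current text item delimiters, which by that point have been reset (normally to the empty string), so the line breaks are dropped. Asking for a text range instead keeps them. On the first issue, one unverified guess is an encoding mismatch, since read and write do not use UTF-8 unless told to with «class utf8». The sample chunk and the file name "note-sketch.txt" below are made up for illustration.

```applescript
-- Hypothetical sample chunk, shaped like one item produced by the split:
-- an empty first line, the page number, then a two-paragraph note
set mainBody to linefeed & "@88" & linefeed & "First paragraph of the note." & linefeed & "Second paragraph of the note."

-- A text range keeps the original line breaks between the paragraphs,
-- because no list is built and nothing has to be re-joined
set mainBodyLean to text from paragraph 3 to paragraph -1 of mainBody

-- Hypothetical target: a scratch file in the temporary items folder
set targetPath to (path to temporary items as text) & "note-sketch.txt"

set fileRef to open for access file targetPath with write permission
try
    set eof fileRef to 0 -- empty the file first, in case it already exists
    write mainBodyLean to fileRef as «class utf8» -- explicit UTF-8 on disk
    close access fileRef
on error errMsg
    close access fileRef
    error errMsg
end try

-- Read the file back with the same explicit encoding
set roundTrip to read file targetPath as «class utf8»
return (count of paragraphs of roundTrip)
```

In the original script the same idea would mean reading the source file with "read theFile as «class utf8»" (assuming the notes file is saved as UTF-8) and replacing the "paragraphs 3 thru -1 ... as string" line with the text range form.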
Many thanks!