Breakout pages from PDF document

I have a number of multipage PDF documents.

I want to extract each page from a document and save it as a separate PDF using this naming scheme:

originalPDFname-PageNumber.pdf

myPDFfile-1.pdf
myPDFfile-2.pdf
myPDFfile-3.pdf
myPDFfile-4.pdf

The only restriction is that this must run without the user having to install any additional software.

Has anyone done this?

Ed. I’ve included my suggestion below. A lot of it is from Shane.

use framework "Foundation"
use framework "Quartz"
use scripting additions

set sourcePDF to POSIX path of (choose file with prompt "Select the source PDF file:")

splitPDF(sourcePDF)

on splitPDF(sourcePDF) -- based on handler written by Shane Stanley
	set sourcePDF to current application's |NSURL|'s fileURLWithPath:sourcePDF
	set sourcePDFName to sourcePDF's lastPathComponent()
	set destinationFolder to sourcePDF's URLByDeletingLastPathComponent()
	set sourceDocument to current application's PDFDocument's alloc()'s initWithURL:sourcePDF
	set sourcePageBounds to (sourceDocument's pageAtIndex:0)'s boundsForBox:(current application's kPDFDisplayBoxMediaBox) -- media bounds of page 1 (may not be necessary)
	set sourcePageCount to sourceDocument's pageCount() as integer
	
	repeat with i from 1 to sourcePageCount
		set targetPDFName to addString(sourcePDFName, ("-" & i))
		set targetPDF to (destinationFolder's URLByAppendingPathComponent:targetPDFName)
		set thePDFPage to (sourceDocument's pageAtIndex:(i - 1))
		(thePDFPage's setBounds:sourcePageBounds forBox:(current application's kPDFDisplayBoxMediaBox)) -- if saved above
		set targetDocument to current application's PDFDocument's new()
		(targetDocument's insertPage:thePDFPage atIndex:0)
		(targetDocument's writeToURL:targetPDF)
	end repeat
end splitPDF

on addString(theFileName, theString)
	set theFileNameNoExtension to theFileName's stringByDeletingPathExtension()
	set theExtension to theFileName's pathExtension()
	return ((theFileNameNoExtension's stringByAppendingString:theString)'s stringByAppendingPathExtension:theExtension) as text
end addString

This is perfect! Thanks, peavine! Here’s the handler I’ll paste into my script:

use framework "Foundation"
use framework "Quartz"
use scripting additions

set sourcePDF to POSIX path of (choose file with prompt "Select the source PDF file:")
set pdfPageFiles to splitPDF(sourcePDF)

on splitPDF(sourcePDF) -- based on handler written by Shane Stanley
	set sourcePDF to current application's |NSURL|'s fileURLWithPath:sourcePDF
	set sourcePDFName to sourcePDF's lastPathComponent()
	set destinationFolder to sourcePDF's URLByDeletingLastPathComponent()
	set sourceDocument to current application's PDFDocument's alloc()'s initWithURL:sourcePDF
	set sourcePageBounds to (sourceDocument's pageAtIndex:0)'s boundsForBox:(current application's kPDFDisplayBoxMediaBox) -- media bounds of page 1 (may not be necessary)
	set sourcePageCount to sourceDocument's pageCount() as integer
	set exportedPDFpages to {}
	repeat with i from 1 to sourcePageCount
		set theIncrementString to ("-" & i)
		set theFileNameNoExtension to sourcePDFName's stringByDeletingPathExtension()
		set theExtension to sourcePDFName's pathExtension()
		set targetPDFName to ((theFileNameNoExtension's stringByAppendingString:theIncrementString)'s stringByAppendingPathExtension:theExtension) as text
		set targetPDF to (destinationFolder's URLByAppendingPathComponent:targetPDFName)
		set thePDFPage to (sourceDocument's pageAtIndex:(i - 1))
		(thePDFPage's setBounds:sourcePageBounds forBox:(current application's kPDFDisplayBoxMediaBox)) -- if saved above
		set targetDocument to current application's PDFDocument's new()
		(targetDocument's insertPage:thePDFPage atIndex:0)
		(targetDocument's writeToURL:targetPDF)
		set the end of exportedPDFpages to targetPDF as text
	end repeat
	return exportedPDFpages
end splitPDF

Ed. I’m glad that was helpful.

Just as a general aside, if the source PDF document contains pages of differing sizes, it’s probably best in most circumstances to disable the two script lines that get and set page bounds:

set sourcePageBounds to (sourceDocument's pageAtIndex:0)'s boundsForBox:(current application's kPDFDisplayBoxMediaBox)
(thePDFPage's setBounds:sourcePageBounds forBox:(current application's kPDFDisplayBoxMediaBox))
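
For anyone unsure whether a particular PDF mixes page sizes, a check along these lines might help decide whether to keep those two lines. This is only a rough sketch: pagesShareMediaBox is a made-up handler name (it is not part of Shane's script), and it assumes that directly comparing the bridged values returned by boundsForBox: is a good enough test of equal page size.

```applescript
use framework "Foundation"
use framework "Quartz"
use scripting additions

-- Hypothetical helper (not from the original scripts): returns true when every
-- page has the same media box as page 1, false otherwise.
on pagesShareMediaBox(posixPath)
	set theURL to current application's |NSURL|'s fileURLWithPath:posixPath
	set theDocument to current application's PDFDocument's alloc()'s initWithURL:theURL
	if theDocument is missing value then error "Could not open PDF: " & posixPath
	set thePageCount to theDocument's pageCount() as integer
	if thePageCount is 0 then return true -- nothing to compare
	set mediaBox to current application's kPDFDisplayBoxMediaBox
	set firstBounds to (theDocument's pageAtIndex:0)'s boundsForBox:mediaBox
	-- PDFKit page indexes are zero-based, so pages 2..n are indexes 1..(n - 1)
	repeat with i from 1 to (thePageCount - 1)
		set pageBounds to ((theDocument's pageAtIndex:i)'s boundsForBox:mediaBox)
		if pageBounds is not equal to firstBounds then return false
	end repeat
	return true
end pagesShareMediaBox
```

If the handler returns false, the pages differ in size and forcing page 1's bounds onto every page would probably crop or pad some of them.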

I cleaned up my script and added some code to prompt the user for the source PDF. I also removed two code lines that set page bounds. FWIW, I thought I would post the script here for anyone looking for an all-in-one solution.

I ran a timing test and it took the script 950 milliseconds to split Shane’s 159-page book on ASObjC. Considering the amount of work done, this seems a good result.

use framework "Foundation"
use framework "Quartz"
use scripting additions

on splitPDF() -- based on handler written by Shane Stanley
	set sourceFile to POSIX path of (choose file of type {"com.adobe.pdf"} with prompt "Choose a PDF file to split")
	set sourceFile to current application's |NSURL|'s fileURLWithPath:sourceFile
	set sourceFolder to sourceFile's URLByDeletingLastPathComponent()
	set sourceFileName to sourceFile's lastPathComponent()
	set sourceFileNameNoExtension to sourceFileName's stringByDeletingPathExtension()
	set sourceFileExtension to sourceFileName's pathExtension()
	set sourceDocument to current application's PDFDocument's alloc()'s initWithURL:sourceFile
	set sourcePageCount to sourceDocument's pageCount() as integer
	repeat with i from 1 to sourcePageCount
		set targetFileName to ((sourceFileNameNoExtension's stringByAppendingString:("-" & i))'s stringByAppendingPathExtension:sourceFileExtension)
		set targetFile to (sourceFolder's URLByAppendingPathComponent:targetFileName)
		set aPDFPage to (sourceDocument's pageAtIndex:(i - 1))
		set targetDocument to current application's PDFDocument's new()
		(targetDocument's insertPage:aPDFPage atIndex:0)
		(targetDocument's writeToURL:targetFile)
	end repeat
end splitPDF

splitPDF()
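
For anyone who wants to run a similar timing test, here is one way it could be done. This is a sketch only: elapsedMilliseconds is a hypothetical handler name, the delay merely stands in for the work being timed, and this is not necessarily how the 950 millisecond figure above was measured.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical helper: whole milliseconds elapsed since startDate (an NSDate).
on elapsedMilliseconds(startDate)
	-- timeIntervalSinceNow is negative for a date in the past, hence the -1000
	return ((startDate's timeIntervalSinceNow()) * -1000) as integer
end elapsedMilliseconds

set startDate to current application's NSDate's |date|()
delay 0.2 -- stand-in for the work being timed, e.g. a call to splitPDF()
set elapsedTime to elapsedMilliseconds(startDate)
```

Results will vary with the machine and with where the script is run (Script Editor, Script Debugger, or an applet), so any figure is only a rough guide.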

Shane’s original script can be found here

Note the comment above the line in that script that saves the bounds of the PDF’s first page:

-- store media bounds of page 1; unnecessary in most cases


And here is the version I’ll be using with the Pages app to export a PDF from a multi-page document and break out the individual pages. This script saves the PDF in the same directory as the Pages document, then creates a pages folder in that directory where the individual pages land.

This version uses Shane’s filemanagerlib script library, but the splitPDF handler itself doesn’t require it. Just pass it a source PDF and a destination folder for the pages (both as POSIX paths) and it will do the rest.

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions
use script "filemanagerlib"
use framework "Foundation"
use framework "Quartz"
tell application "Pages"
   tell its document 1
      set filePath to (its file)
      set fileInfo to (parse object filePath ¬
         with HFS results)
      set fileLocation to parent_folder_path of fileInfo
      set fileName to name_stub of fileInfo
      set pageFolderName to fileName & "-pages"
      set pageFolder to create folder at fileLocation ¬
         use name pageFolderName
      set pdfName to fileName & ".pdf"
      set pdfPath to fileLocation & pdfName
      export to pdfPath as PDF
      set pdfPath to POSIX path of pdfPath
      set thePDFPages to my splitPDF(pdfPath, pageFolder)
   end tell
end tell
tell application "Finder" to reveal filePath

on splitPDF(sourcePDF, destinationFolder) -- based on handler written by Shane Stanley as modified by MacScripter.net contributor Peavine
   set sourcePDF to current application's |NSURL|'s fileURLWithPath:sourcePDF
   set sourcePDFName to sourcePDF's lastPathComponent()
   set destinationFolder to current application's |NSURL|'s fileURLWithPath:destinationFolder
   set sourceDocument to current application's PDFDocument's alloc()'s initWithURL:sourcePDF
   set sourcePageCount to sourceDocument's pageCount() as integer
   set exportedPDFpages to {}
   repeat with i from 1 to sourcePageCount
      set theIncrementString to ("-" & i)
      set theFileNameNoExtension to sourcePDFName's stringByDeletingPathExtension()
      set theExtension to sourcePDFName's pathExtension()
      set targetPDFName to ((theFileNameNoExtension's stringByAppendingString:theIncrementString)'s stringByAppendingPathExtension:theExtension) as text
      set targetPDF to (destinationFolder's URLByAppendingPathComponent:targetPDFName)
      set thePDFPage to (sourceDocument's pageAtIndex:(i - 1))
      set targetDocument to current application's PDFDocument's new()
      (targetDocument's insertPage:thePDFPage atIndex:0)
      (targetDocument's writeToURL:targetPDF)
      set the end of exportedPDFpages to targetPDF as text
   end repeat
   return exportedPDFpages
end splitPDF
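
If installing a script library isn't an option, the "-pages" folder could probably be created with Foundation alone. The following is a sketch under that assumption: makePagesFolder is a hypothetical handler name, and the "<name>-pages" naming simply mirrors the script above.

```applescript
use framework "Foundation"
use scripting additions

-- Hypothetical helper: creates a "<name>-pages" folder beside the given file
-- (if it does not already exist) and returns the folder's POSIX path.
on makePagesFolder(sourcePath)
	set sourceURL to current application's |NSURL|'s fileURLWithPath:sourcePath
	set nameStub to sourceURL's lastPathComponent()'s stringByDeletingPathExtension()
	set folderName to nameStub's stringByAppendingString:"-pages"
	set parentURL to sourceURL's URLByDeletingLastPathComponent()
	set folderURL to parentURL's URLByAppendingPathComponent:folderName isDirectory:true
	set fileManager to current application's NSFileManager's defaultManager()
	-- withIntermediateDirectories:true means an existing folder is not treated as an error
	set {didCreate, theError} to fileManager's createDirectoryAtURL:folderURL withIntermediateDirectories:true attributes:(missing value) |error|:(reference)
	if not (didCreate as boolean) then error (theError's localizedDescription() as text)
	return folderURL's |path|() as text
end makePagesFolder
```

The returned path can then be passed as the second argument, for example: set thePDFPages to splitPDF(pdfPath, makePagesFolder(pdfPath)).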

FYI, I’m using this to populate an inPosition printing template for very odd-sized booklets. (Designed to fit in a shirt pocket or a back pocket.) So far so good! Thanks!