Dividing a pdf document in half repeatedly

Looking for the formula to make this splitting happen:

              +------------------+
              |  5000 page doc   |
              +------------------+
                |              |
                v              v
     +-----------+           +-----------+
     |  2500 pg  |           |  2500 pg  |
     +-----------+           +-----------+
      |         |             |         |
      v         v             v         v
+---------+ +---------+ +---------+ +---------+
| 1250 pg | | 1250 pg | | 1250 pg | | 1250 pg |
+---------+ +---------+ +---------+ +---------+

----> Continue splitting to as many levels as needed.

It does not have to be Acrobat specific; just the principle of doing this kind of dividing of a bigger doc to a smaller doc. Some kind of repeat loop that 1st year computer majors learn, I am guessing.

The formula:


my split_Pdf_To_2_Parts(myPDF)

on split_Pdf_To_2_Parts(myPDF)
	-- 1) Get the number of pages in myPDF
	-- 2) Compute (number of pages) div 2 -- integer division!
	-- 3) If the result > 0 then
	--------- split myPDF into 2 parts here, say myPDF1 and myPDF2
	--------- my split_Pdf_To_2_Parts(myPDF1) -- first recursive call of the handler
	--------- my split_Pdf_To_2_Parts(myPDF2) -- second recursive call of the handler
	----- end if
end split_Pdf_To_2_Parts
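
The recursion in that outline can be tried on page numbers alone, before any PDF tool is involved. Below is a minimal sketch, not a finished script: the handler name halveRange and its levelsLeft parameter are my own inventions for illustration, and it only returns the first and last page of each piece.

```applescript
-- Hypothetical helper: returns a list of {firstPage, lastPage} pairs
-- after halving the range levelsLeft times.
on halveRange(firstPage, lastPage, levelsLeft)
	set pageTotal to lastPage - firstPage + 1
	-- Stop when no levels remain or a single page cannot be divided.
	if levelsLeft = 0 or pageTotal < 2 then return {{firstPage, lastPage}}
	-- Integer division: with an odd total the second half gets the extra page.
	set midPage to firstPage + (pageTotal div 2) - 1
	set leftRanges to my halveRange(firstPage, midPage, levelsLeft - 1)
	set rightRanges to my halveRange(midPage + 1, lastPage, levelsLeft - 1)
	return leftRanges & rightRanges
end halveRange

set pageRanges to halveRange(1, 5000, 2)
-- {{1, 1250}, {1251, 2500}, {2501, 3750}, {3751, 5000}}
```

Each pair could then be handed to whatever tool does the actual extraction.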

There are many experienced scripters on this site. Maybe someone will post a real script. I could do this too, but I still cannot understand the practical benefits of this type of division.

It may help to look at :

https://macscripter.net/viewtopic.php?id=46089

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Monday 5 August 2019 20:53:53

A similar question about splitting PDFs was asked by this poster, where they wanted to supply a specific section count and apportion the slight difference.
https://macscripter.net/viewtopic.php?id=44363
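
For readers who do not follow the link, apportioning the difference for a given section count might look roughly like this. It is only a sketch of the general idea, not the code from that thread, and the handler name partSizes is an assumption of mine. Note that it spreads the leftover pages over the first sections, which is a different scheme from repeated halving.

```applescript
-- Hypothetical helper: sizes of partCount sections that differ by at most one page.
on partSizes(pageCount, partCount)
	set baseSize to pageCount div partCount
	set extraPages to pageCount mod partCount
	set theSizes to {}
	repeat with partIndex from 1 to partCount
		if partIndex <= extraPages then
			-- The first extraPages sections each take one leftover page.
			set end of theSizes to baseSize + 1
		else
			set end of theSizes to baseSize
		end if
	end repeat
	return theSizes
end partSizes

set sectionSizes to partSizes(33, 4)
-- {9, 8, 8, 8}
```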

Exactly halved break points can be done as simply as this:

set totes to 5000
repeat with value from totes to 0 by (-totes / 2)
	log value
end repeat
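
That loop logs the break points for a single halving (5000, 2500, 0). To get the break points for several levels at once, something like the following sketch may do; the variable names are mine, and div is used so every break point stays an integer.

```applescript
set totalPages to 5000
set levelCount to 3
set levelBreaks to {}
repeat with levelIndex from 1 to levelCount
	-- Level 1 has 2 parts, level 2 has 4 parts, and so on.
	set partCount to (2 ^ levelIndex) as integer
	set breakPoints to {}
	repeat with partIndex from 0 to partCount
		set end of breakPoints to (totalPages * partIndex) div partCount
	end repeat
	set end of levelBreaks to breakPoints
end repeat
-- item 1 of levelBreaks is {0, 2500, 5000}
-- item 2 of levelBreaks is {0, 1250, 2500, 3750, 5000}
```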

#edited for comment generalization rather than directing to OP, as this is another resurrected thread from years ago

OK, now I understand what the OP wants. He says in his post:
“Looking for the most efficient way to tag and open the subsequent halved pdfs to split in half before I start working thru it.”

But I do not have “Acrobat Pro” installed on my computer, so I cannot fix the OP’s script. The formula is provided, and that is all.

I revised my script as follows:

  • The user can set the levels to which the source PDF is split.

  • The number of pages in the last split PDF file is increased or decreased to account for an uneven number of pages (e.g. a 33-page PDF split into two parts).

  • The split files are placed in individual folders.


--Set the number of levels to split the source PDF.
set splitLevels to 2

--Set path to PDF utility.
set pdfUtility to "/usr/local/bin/cpdf"

--Prompt for source PDF.
set pdfFile to choose file of type {"com.adobe.pdf"} without invisibles
set pdfPosixFile to POSIX path of pdfFile

--Set variable to name of source PDF.
tell application "Finder"
	set pdfName to (name of pdfFile)
	set pdfName to text 1 thru -5 of pdfName
end tell

--Set variable to the number of pages in the source PDF.
set pageCount to (do shell script pdfUtility & " -pages " & quoted form of pdfPosixFile) as integer

--Notify if source PDF does not contain enough pages to split.
if ((2 ^ splitLevels) as integer) > pageCount then
	set dialogText to "The source PDF contains " & pageCount & " pages and cannot be split."
	display dialog dialogText buttons {"OK"} cancel button 1 default button 1 ¬
		with title "PDF Split" with icon stop
end if

--Set variable to target parent folder for split PDF files.
set targetFolder to POSIX path of (path to desktop folder as text) & pdfName & "/"

--Create target folders.
repeat with i from 1 to splitLevels
	do shell script "mkdir -p " & quoted form of (targetFolder & "Split " & i)
end repeat

--Loop through each split level.
repeat with i from 1 to splitLevels
	set splitNumber to (2 ^ i) as integer
	set chunkSize to pageCount div splitNumber
	set fileNumber to 1
	set startPage to 1
	
	--Create split PDF files.
	repeat with j from 1 to splitNumber
		set destinationFile to targetFolder & "Split " & i & "/Page Split " & chunkSize & " - " & fileNumber & ".pdf"
		if j < splitNumber then
			--End each range one page before the next one starts so the files do not overlap.
			do shell script pdfUtility & " " & quoted form of pdfPosixFile & " " & startPage & "-" & ¬
				(startPage + chunkSize - 1) & " -o " & quoted form of destinationFile
			set fileNumber to fileNumber + 1
			set startPage to startPage + chunkSize
		else
			do shell script pdfUtility & " " & quoted form of pdfPosixFile & " " & startPage & "-end -o " & ¬
				quoted form of destinationFile
		end if
	end repeat
end repeat
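
The page ranges such a loop hands to the utility can be checked without touching a PDF. Here is a sketch, assuming each range except the last ends at startPage + chunkSize - 1 so that neighbouring files do not share a page, and that the last range runs to "end" to absorb any remainder. The handler name rangeStrings is hypothetical.

```applescript
-- Hypothetical helper: builds the page-range strings for one split level.
on rangeStrings(pageCount, splitNumber)
	set chunkSize to pageCount div splitNumber
	set startPage to 1
	set theRanges to {}
	repeat with j from 1 to splitNumber
		if j < splitNumber then
			set end of theRanges to (startPage as text) & "-" & (startPage + chunkSize - 1)
			set startPage to startPage + chunkSize
		else
			-- The last file takes whatever pages are left.
			set end of theRanges to (startPage as text) & "-end"
		end if
	end repeat
	return theRanges
end rangeStrings

set levelTwoRanges to rangeStrings(33, 4)
-- {"1-8", "9-16", "17-24", "25-end"}
```

With 33 pages and four files, the last file ends up with nine pages (25 to 33) and the others with eight each.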

Seems interesting, but where can we get cpdf?
I found a site offering it, but the result is a file named cpdf.dms
and I don’t know how to expand it.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Tuesday 6 August 2019 19:28:34

Oops, I got it from : http://community.coherentpdf.com

This is torment in vain. There is no practical benefit to it. It is just a puzzle: a problem that has no exact solution (formula) in mathematics but does have an exact algorithm in programming. In addition, your script should be recursive, asking the user “Stop splitting?”, and along the way it should create new subfolders for the new divisions.

NOTE: I also did not know that there are utilities that can split PDFs. Here’s one benefit: I’ve learned something else. :)

@KniazidisR

In message #2 I pointed to a piece of code which splits a PDF into pages.
I don’t know if it can be edited to split a document into two smaller ones.
Of course we could split it into pages, then concatenate the first half of the pages into one file and the second half into another, but that sounds awful.

Yvan KOENIG running High Sierra 10.13.6 in French (VALLAURIS, France) Tuesday 6 August 2019 19:41:27

Shane’s script is the best for dividing a document into individual pages. But for this concrete task it is better to use the shell variant. It is a task that I will probably never need …

I’m always hesitant to post scripts that use utilities like cpdf, because I might be seen as promoting them. I guess if there’s an issue, Nigel will inform me, but, just for the record, I have no connection with cpdf.

It should also be noted that cpdf is an expensive commercial utility. The author has released a free, not-for-commercial-use version of the utility and that’s what I use.

Anyways, the utility is available as a compiled binary on GitHub:

https://github.com/coherentgraphics/cpdf-binaries

The page on which the developer notes the “special not-for-commercial-use license” is:

https://community.coherentpdf.com

It can be edited to split a document in two:

use AppleScript version "2.4"
use scripting additions
use framework "Foundation"
use framework "Quartz" -- required for PDF stuff

set inPath to POSIX path of (choose file with prompt "Choose a PDF file:")
its splitInHalfInPath:inPath

on splitInHalfInPath:inPath
	--  make URL of the PDF
	set inNSURL to current application's |NSURL|'s fileURLWithPath:inPath
	-- get extension-less name of PDF
	set docName to (inNSURL's lastPathComponent()'s stringByDeletingPathExtension()) as text
	-- make URL of output folder
	set outFolderNSURL to inNSURL's URLByDeletingLastPathComponent()
	-- make PDF document from the file
	set theDoc to current application's PDFDocument's alloc()'s initWithURL:inNSURL
	-- count the pages
	set theCount to theDoc's pageCount() as integer
	-- integer division keeps the loop bound whole; with an odd count the second file gets the extra page
	set doc1Count to theCount div 2
	-- build new document URLs
	set newDocName1 to docName & "-1.pdf"
	set outNSURL to (outFolderNSURL's URLByAppendingPathComponent:newDocName1)
	set newDocName2 to docName & "-2.pdf"
	set outNSURL2 to (outFolderNSURL's URLByAppendingPathComponent:newDocName2)
	-- make new PDF document
	set theNewPDFDocument to current application's PDFDocument's new()
	repeat with i from 1 to doc1Count
		-- get first page of old PDF
		set thePDFPage to (theDoc's pageAtIndex:0) -- zero-based indexes
		(theDoc's removePageAtIndex:0)
		(theNewPDFDocument's insertPage:thePDFPage atIndex:(i - 1))
	end repeat
	-- save the new PDF
	(theNewPDFDocument's writeToURL:outNSURL)
	(theDoc's writeToURL:outNSURL2)
end splitInHalfInPath:

Rather than repeated opening-closing for repeat splits it would be quicker to do all the calculations first, but calling this repeatedly is probably fast enough anyway.

Thanks for this, Shane. Since, as you say, dividing into 2 parts may be quick enough, I will try to apply your script to build a complete recursive process and present it here soon.

OK, UPDATE. The following is the RECURSIVE variant of Shane’s script. It does not create subfolders (for better speed). That last task can easily be implemented later, based on the names of the resulting PDFs:


use AppleScript version "2.5"
use scripting additions
use framework "Foundation"
use framework "Quartz" -- required for PDF stuff
property |NSURL| : a reference to |NSURL| of current application
property PDFDocument : a reference to PDFDocument of current application
global outFolderNSURL, theParts

set inPath to POSIX path of (choose file with prompt "Choose a PDF file:")
display dialog "How many parts do you need?" default answer "2"
set theParts to (text returned of result) as integer
set inNSURL to current application's |NSURL|'s fileURLWithPath:inPath --  make URL of the PDF
set outFolderNSURL to inNSURL's URLByDeletingLastPathComponent() -- make URL of output folder
my splitInHalfInPath(inNSURL, 1)

on splitInHalfInPath(inNSURL, myCount)
	-- Recurse only until the requested number of parts is reached.
	-- (An "if" is used here: a "repeat while" would split the same file again after the recursive calls return.)
	if myCount < theParts then
		-- get extension-less name of PDF
		set docName to (inNSURL's lastPathComponent()'s stringByDeletingPathExtension()) as text
		-- make PDF document from the file
		set theDoc to PDFDocument's alloc()'s initWithURL:inNSURL
		-- count the pages
		set theCount to theDoc's pageCount() as integer
		-- integer division; with an odd count the second file gets the extra page
		set doc1Count to theCount div 2
		-- build new document URLs
		set newDocName1 to docName & "-1.pdf"
		set outNSURL to (outFolderNSURL's URLByAppendingPathComponent:newDocName1)
		set newDocName2 to docName & "-2.pdf"
		set outNSURL2 to (outFolderNSURL's URLByAppendingPathComponent:newDocName2)
		-- make new PDF document
		set theNewPDFDocument to PDFDocument's new()
		repeat with i from 1 to doc1Count
			-- get first page of old PDF
			set thePDFPage to (theDoc's pageAtIndex:0) -- zero-based indexes
			(theDoc's removePageAtIndex:0)
			(theNewPDFDocument's insertPage:thePDFPage atIndex:(i - 1))
		end repeat
		-- save the new PDFs
		(theNewPDFDocument's writeToURL:outNSURL)
		(theDoc's writeToURL:outNSURL2)
		
		-- each level of splitting doubles the number of parts
		set myCount to myCount * 2
		my splitInHalfInPath(outNSURL, myCount)
		my splitInHalfInPath(outNSURL2, myCount)
	end if
end splitInHalfInPath