Step-by-Step: Splitting a Scanned Book PDF Into Chapters
You finally scanned that 400-page textbook. The scanner did its job — one massive PDF, every page captured. But now you're staring at a single 180 MB file and you need chapter 7 for your study group, chapter 12 for a colleague, and the introduction for yourself. Opening the whole thing on a phone is painful. Sharing it is worse.
Splitting a scanned book PDF into individual chapters sounds simple, but scanned books come with a catch that regular PDFs don't: unless the scan was run through OCR, there's no text layer, no bookmarks, and no useful metadata, just images of pages. You can't rely on a tool that searches for the word "Chapter" because it doesn't see words at all. You're navigating purely by visual inspection and page numbers.
This guide walks you through the whole process, from figuring out where chapters begin and end, to actually making the splits, to keeping the output files manageable in size. No programming required.
Before You Start: What You Actually Need to Know
A scanned PDF is essentially a stack of page images (typically JPEG-compressed) bundled into a PDF wrapper. Every "page" in the PDF corresponds to one image. The good news: splitting by page range works the same whether or not the content has a text layer. You're just copying image pages into new files.
The main task before any splitting is building a chapter map — a simple list of which page numbers correspond to which chapters. Do this once and everything else becomes mechanical.
Step 1: Build Your Chapter Map
Open the PDF in whatever viewer you have: Adobe Acrobat Reader, Preview on Mac, Foxit, even a browser. Go to the book's table of contents. Keep in mind that the page numbers listed there are printed page numbers, not PDF page numbers.
Here's a subtlety that trips people up: scanned books often have front matter — title page, copyright page, dedication, TOC — before the actual page numbering begins. Page 1 of the book might be PDF page 9 or PDF page 14 depending on how much front matter exists.
So do this: find a page you can identify confidently (say, the book's "Chapter 1" heading page). Look at the PDF page counter in your viewer (usually shown at the bottom or top). Write that down. Then look at what printed page number appears at the bottom of that page in the scan itself. The difference between these two numbers is your offset.
Example: The PDF viewer says you're on page 11. The printed number at the bottom of that page says "1". Your offset is 10. Every printed page number + 10 = the actual PDF page you need to cut at.
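If you'd rather not do the arithmetic in your head, here is a minimal shell sketch of the same calculation. The function names compute_offset and to_pdf_page are made up for this illustration; the numbers are the ones from the example above.

```shell
# Offset = PDF page shown by the viewer minus the printed page number on the scan.
compute_offset() {
    # $1 = PDF page counter, $2 = printed page number on that same page
    echo $(( $1 - $2 ))
}

# PDF page = printed page + offset.
to_pdf_page() {
    # $1 = printed page number, $2 = offset
    echo $(( $1 + $2 ))
}

offset=$(compute_offset 11 1)
echo "offset: $offset"                                 # offset: 10
echo "printed 25 -> PDF $(to_pdf_page 25 "$offset")"   # printed 25 -> PDF 35
```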
Now go through the table of contents and build a simple table:
Chapter 1: PDF pages 11–34
Chapter 2: PDF pages 35–61
Chapter 3: PDF pages 62–88
...
Write this down or paste it in a notes file. This is your master map. Don't skip this step — it's what makes everything downstream clean.
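If you're comfortable with a terminal, the map can also be derived from the printed start pages in the TOC. The sketch below is one possible way to do it, not a required step. It assumes every chapter starts on a new page, that you know the PDF page where the last chapter ends, and that chapter names contain no spaces. The function name build_map and the "name first last" line format are inventions for this example.

```shell
# build_map OFFSET LAST_PDF_PAGE
# Reads "name printed_start" lines on stdin and prints "name pdf_first pdf_last".
# Each chapter is assumed to end one page before the next one starts.
build_map() {
    offset=$1
    last_page=$2
    prev_name=""
    prev_start=""
    while read -r name printed; do
        if [ -z "$name" ]; then continue; fi
        start=$(( printed + offset ))
        if [ -n "$prev_name" ]; then
            echo "$prev_name $prev_start $(( start - 1 ))"
        fi
        prev_name=$name
        prev_start=$start
    done
    # The final chapter runs to the last page you pass in.
    if [ -n "$prev_name" ]; then
        echo "$prev_name $prev_start $last_page"
    fi
}

build_map 10 88 <<'EOF'
chapter_01 1
chapter_02 25
chapter_03 52
EOF
# prints:
# chapter_01 11 34
# chapter_02 35 61
# chapter_03 62 88
```

Back matter such as an index or bibliography needs its own entry, otherwise it ends up inside the last chapter.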
Step 2: Choose Your Splitting Tool
You have a few good options depending on your setup:
Option A: Online PDF Split Tools (Easiest, No Install)
If your PDF isn't confidential, browser-based tools are fast. Tools like Smallpdf, iLovePDF, or PDF2Go have a "split by page range" feature. You enter the start and end page, click split, download the result. Repeat for each chapter.
The downside: depending on the tool, you may have to upload a potentially large file repeatedly (once per chapter). If your scan is 200 MB and you have 15 chapters, that's a lot of uploading. If the book contains any private or sensitive content, uploading to a third-party server isn't ideal.
Option B: Adobe Acrobat (If You Have It)
Acrobat Pro has an "Organize Pages" view where you can visually select pages and extract them. It's reliable, handles large files well, and lets you extract multiple ranges in one session. If your organization has a license, use it.
Option C: PDF24 (Free, Local App Option)
PDF24 has both a web version and a downloadable desktop app (Windows). The desktop version processes everything locally — nothing leaves your machine. It has a split tool where you specify exact page ranges and name the output files. Highly recommended if you're doing this on a personal or sensitive document.
Option D: pdftk or Ghostscript (Command Line, Any OS)
If you're comfortable with a terminal, pdftk is the cleanest option for batch work. Once installed:
pdftk input.pdf cat 11-34 output chapter_01.pdf
pdftk input.pdf cat 35-61 output chapter_02.pdf
You can script this across all chapters in about 10 minutes if you have your chapter map ready. On Mac, you can install pdftk via Homebrew (brew install pdftk-java). On Linux it's usually in the package manager.
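As a rough sketch of what such a script might look like: the function below reads a chapter map and prints one pdftk command per chapter, so you can read the commands before anything runs. The function name print_split_commands and the "name first last" map format are assumptions made for this example, and names are assumed to contain no spaces or quotes.

```shell
# print_split_commands INPUT_PDF
# Reads "name first last" lines on stdin and prints one pdftk command per chapter.
# Nothing is executed here; the output is just text.
print_split_commands() {
    input=$1
    while read -r name first last; do
        if [ -z "$name" ]; then continue; fi
        printf 'pdftk "%s" cat %s-%s output "%s.pdf"\n' "$input" "$first" "$last" "$name"
    done
}

print_split_commands input.pdf <<'EOF'
chapter_01 11 34
chapter_02 35 61
EOF
# prints:
# pdftk "input.pdf" cat 11-34 output "chapter_01.pdf"
# pdftk "input.pdf" cat 35-61 output "chapter_02.pdf"
```

Once the printed commands look right, piping them into sh (print_split_commands input.pdf < chapter_map.txt | sh) runs them. Ghostscript can also extract a range using its -dFirstPage and -dLastPage options with the pdfwrite device, though it rewrites the file rather than simply copying pages, so check the output quality if you go that route.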
Step 3: Do the Splits
Whichever tool you chose, work through your chapter map one entry at a time. Name the output files consistently so they sort correctly in any folder view:
chapter_01_introduction.pdf
chapter_02_background.pdf
chapter_03_methodology.pdf
Using zero-padded numbers (01, 02, 03 rather than 1, 2, 3) means file explorers will sort them in order. A small thing that matters when you have 14 chapters.
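If you're scripting the splits, printf can do the zero-padding for you. This is a small sketch under a few assumptions: the helper name chapter_filename is hypothetical, the chapter number is passed without leading zeros, and titles are plain ASCII.

```shell
# chapter_filename NUMBER TITLE
# Prints a zero-padded, lowercase, underscore-separated filename.
# Pass NUMBER without leading zeros (printf would read "08" as invalid octal).
chapter_filename() {
    # Lowercase the title, then turn every run of non-alphanumerics into one "_".
    slug=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '_')
    printf 'chapter_%02d_%s.pdf\n' "$1" "$slug"
}

chapter_filename 3 "Methodology"          # chapter_03_methodology.pdf
chapter_filename 12 "Literature Review"   # chapter_12_literature_review.pdf
```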
After each split, open the resulting file and check three things: the first page, a page in the middle, and the last page. Make sure you didn't accidentally cut a page short or include the first page of the next chapter.
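The visual check catches content mistakes. A quick sanity check of the chapter map itself can catch the other common slip: ranges that overlap or leave a gap. The sketch below assumes a map with one "name first last" line per chapter, in order; check_map is a made-up name for this example.

```shell
# check_map
# Reads "name first last" lines on stdin and reports gaps, overlaps,
# and reversed ranges. Returns non-zero if any problem was found.
check_map() {
    expected=""
    status=0
    while read -r name first last; do
        if [ -z "$name" ]; then continue; fi
        if [ "$first" -gt "$last" ]; then
            echo "$name: first page $first is after last page $last"
            status=1
        fi
        if [ -n "$expected" ] && [ "$first" -ne "$expected" ]; then
            echo "$name: starts at $first, expected $expected"
            status=1
        fi
        # The next chapter should begin right after this one ends.
        expected=$(( last + 1 ))
    done
    return "$status"
}

# A map with a one-page gap between chapters 1 and 2:
printf 'chapter_01 11 34\nchapter_02 36 61\n' | check_map || echo "fix the map before splitting"
# prints:
# chapter_02: starts at 36, expected 35
# fix the map before splitting
```

To compare against the files themselves, pdftk chapter_01.pdf dump_data reports a NumberOfPages line, which should equal last minus first plus one.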
Step 4: Handle the File Size Problem
Scanned PDFs are heavy. A 400-page scan at 300 DPI might be 150–200 MB total, which means individual chapter files can still be 15–30 MB each. That's workable on a laptop but annoying on mobile, slow to email, and sometimes over attachment limits.
Compress each chapter file after splitting. Many of the tools that split can also compress. In Ghostscript:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/ebook \
-sOutputFile=chapter_01_compressed.pdf \
chapter_01.pdf
The /ebook setting targets roughly 150 DPI — good for screen reading. Use /printer (300 DPI) if people will print these. You'll typically see 40–60% file size reduction on scanned content, sometimes more.
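With a dozen or more chapters, typing that command for each file gets tedious. Here is one way it could be batched: a sketch that prints the Ghostscript command for each file rather than running it, plus a small helper for checking how much a file actually shrank. Both function names are hypothetical, and the "_compressed" suffix simply follows the example above.

```shell
# print_compress_commands PRESET FILE...
# Prints one gs command per input file, using the same flags as above.
print_compress_commands() {
    preset=$1
    shift
    for f in "$@"; do
        out="${f%.pdf}_compressed.pdf"
        printf 'gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/%s -sOutputFile="%s" "%s"\n' "$preset" "$out" "$f"
    done
}

# reduction_percent ORIGINAL_BYTES COMPRESSED_BYTES
# Integer percentage saved, rounded down.
reduction_percent() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

print_compress_commands ebook chapter_01.pdf chapter_02.pdf
reduction_percent 20000000 9000000   # 55
```

Byte counts for reduction_percent can come from wc -c < chapter_01.pdf. As with any generated commands, read them over before piping them into sh.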
If command line isn't your thing, the same online tools (Smallpdf, PDF Compressor, PDF24) have compression modes. Just run the chapter files through after splitting.
Step 5: Add Basic Metadata (Optional but Useful)
Right now each chapter file carries no descriptive metadata, only a filename. If you're sharing these with others, it helps to set the PDF title property so that readers configured to show the document title display "Chapter 3: Methodology" instead of "chapter_03.pdf".
In Acrobat: File → Properties → Description tab. Type in the title and author.
With pdftk you can do this using its update_info operation and a small metadata text file, though it's slightly more involved. For most personal use cases, good filenames are enough.
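For the curious, the pdftk route looks roughly like this. The metadata file uses the same InfoBegin / InfoKey / InfoValue layout that pdftk's dump_data prints. The helper name write_info, the placeholder author, and the output filename are assumptions for this sketch.

```shell
# write_info TITLE AUTHOR
# Prints a pdftk metadata file that sets the Title and Author fields.
write_info() {
    printf 'InfoBegin\nInfoKey: Title\nInfoValue: %s\n' "$1"
    printf 'InfoBegin\nInfoKey: Author\nInfoValue: %s\n' "$2"
}

write_info "Chapter 3: Methodology" "Author Name"

# To apply it, save the output to a file and pass it to update_info.
# pdftk needs an output file that is different from the input:
#   write_info "Chapter 3: Methodology" "Author Name" > chapter_03_info.txt
#   pdftk chapter_03.pdf update_info chapter_03_info.txt output chapter_03_titled.pdf
```

For titles with non-ASCII characters, pdftk also offers update_info_utf8.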
Troubleshooting: Common Problems
The chapter breaks are off by one or two pages
This almost always comes down to the offset calculation being wrong, or the table of contents page itself being included in or excluded from a chapter. Double-check your offset by confirming two known page numbers, not just one, ideally one near the start and one near the end of the book, since unnumbered plates or skipped blank pages can shift the offset partway through. Also check whether the TOC in the book lists the chapter start as the heading page or the first content page; sometimes those differ.
Some pages are rotated sideways
Scans of landscape charts, appendix tables, or oversized pages often end up rotated. This is a property of each individual page in the PDF. You can fix this in Acrobat's Organize Pages view (right-click → Rotate) or with pdftk's rotate command. Do this before splitting so you only have to fix the original once.
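A sketch of the pdftk variant, assuming a reasonably recent pdftk (or pdftk-java) that includes the rotate operation. The function name print_rotate_command is made up here, and it only prints the command so you can check it first. The right, left, and down keywords rotate relative to the page's current orientation: 90 degrees clockwise, 90 degrees counterclockwise, and 180 degrees.

```shell
# print_rotate_command INPUT PAGE DIRECTION OUTPUT
# DIRECTION is one of pdftk's relative rotations: right, left, down.
# All other pages are left unchanged by pdftk's rotate operation.
print_rotate_command() {
    printf 'pdftk "%s" rotate %s%s output "%s"\n' "$1" "$2" "$3" "$4"
}

print_rotate_command input.pdf 50 right input_fixed.pdf
# prints:
# pdftk "input.pdf" rotate 50right output "input_fixed.pdf"
```

If your pdftk build lacks rotate, the same result can be had with cat by listing every range explicitly, for example cat 1-49 50right 51-end.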
Output files are still huge even after compression
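Check the original scan resolution. If someone scanned at 600 DPI for archival purposes, the files will stay large unless the compression step downsamples them. The /ebook Ghostscript preset downsamples fairly aggressively; if files are still too big, /screen (roughly 72 DPI) goes further at a real cost to legibility. If quality is suffering instead, try /printer, or accept the larger size for high-resolution needs.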
The split tool counts pages differently than my viewer
Some tools count physical pages from 1, while some viewers display embedded page labels (such as i, ii, iii for front matter). If your splits keep coming out wrong, open the PDF in Acrobat Reader and look at the page number box: when labels are present it typically shows both, something like "iv (4 of 400)". The "4 of 400" position is the physical page number, which is what command-line tools such as pdftk work with. Match your split ranges to that.
Keeping It Organized
Once you've finished splitting, put everything in a folder structure that will still make sense six months from now:
/ScannedBook_Title/
    original_full_scan.pdf
    chapter_map.txt
    /chapters/
        chapter_01_introduction.pdf
        chapter_02_background.pdf
        ...
Keep the original. Keep the chapter map. If you later discover your offset was wrong, you can re-split from the original without starting over.
The chapter map text file is worth keeping because you'll forget the offset within a week. Future you will be grateful it's there.
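If you like, the folder skeleton and the offset note can be created in one go. This is only a sketch: make_layout is a hypothetical helper, and the "#" line it writes is a note for humans, so filter it out (for example with grep -v '^#') before feeding the map to any script.

```shell
# make_layout ROOT OFFSET
# Creates ROOT/chapters and starts chapter_map.txt with the offset on its first line.
make_layout() {
    mkdir -p "$1/chapters"
    printf '# offset (PDF page minus printed page): %s\n' "$2" > "$1/chapter_map.txt"
}

# Example (uncomment to run in the current directory):
# make_layout ScannedBook_Title 10
# mv input.pdf ScannedBook_Title/original_full_scan.pdf
```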
Splitting a scanned book PDF is repetitive work, but it's not complicated once you've built that initial chapter map. The offset calculation is the only part that requires careful attention. Everything after that is just applying page ranges and letting the tool do the cutting. An hour of focused work on a 20-chapter book will leave you with a clean, organized set of files that are genuinely easier to use than the original monolith.