PDF Reader
Official by Anthropic
The PDF skill teaches Claude how to handle PDF-related tasks end to end. It covers reading and extracting text with layout preservation, pulling tables into structured data, merging or splitting documents, rotating pages, adding watermarks, password-protecting files, extracting embedded images, and running OCR on scanned PDFs to make them searchable. For form filling, it includes a full scripted workflow: detecting whether a form has fillable fields, extracting field metadata, converting pages to images for visual analysis, and writing values back either into fillable fields or, for non-fillable forms, as coordinate-based text annotations. Internally it uses Python libraries (pypdf, pdfplumber, reportlab, pytesseract) and command-line tools (pdftotext, qpdf, pdftk), with eight ready-to-run Python scripts bundled in the skill.
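The first and last steps of the form-filling workflow can be sketched as below. This is a minimal illustration, not the skill's bundled scripts: the function names `has_fillable_fields` and `image_to_pdf_point` are hypothetical. The coordinate helper assumes an unrotated page whose lower-left corner sits at the PDF origin.

```python
def has_fillable_fields(pdf_path: str) -> bool:
    """Return True if the PDF exposes interactive form fields."""
    # Imported lazily so the coordinate helper below works without pypdf installed.
    from pypdf import PdfReader

    fields = PdfReader(pdf_path).get_fields()  # None when the PDF has no form fields
    return bool(fields)


def image_to_pdf_point(x_px, y_px, image_size, page_size):
    """Map a pixel position on a rendered page image to PDF points.

    Images place the origin at the top-left with y growing downward, while PDF
    user space places it at the bottom-left with y growing upward, so the y
    coordinate is scaled and then flipped.
    Assumption: the page is unrotated and its lower-left corner is at (0, 0).
    """
    img_w, img_h = image_size
    page_w, page_h = page_size
    x_pt = x_px * page_w / img_w
    y_pt = page_h - (y_px * page_h / img_h)
    return x_pt, y_pt


# A US Letter page (612 x 792 points) rendered at 1224 x 1584 pixels:
# the centre of the image maps to the centre of the page.
center = image_to_pdf_point(612, 792, (1224, 1584), (612, 792))
```

A position found by looking at the page image, such as the blank line next to a label, can be converted this way before a text annotation is placed at that position on a non-fillable form.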
Key Features
- ✓ Extract text and tables from PDFs with layout preservation using pdfplumber
- ✓ Merge, split, and rotate PDF pages with pypdf
- ✓ Create new PDFs programmatically with reportlab, including multi-page styled output
- ✓ Fill PDF forms, including both fillable fields and non-fillable forms via coordinate-based annotations
- ✓ OCR scanned PDFs using pytesseract and pdf2image to make them searchable
- ✓ Encrypt and decrypt PDFs with password protection, and add watermarks
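As an illustration of the table-extraction feature, the sketch below pulls one table from a page with pdfplumber and serializes it as CSV. It is one possible approach, not the skill's own script. The helper names `rows_to_csv` and `largest_table_as_csv` are hypothetical. pdfplumber's `extract_table()` returns the largest table on the page as a list of rows, or `None` when it finds none.

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialize one extracted table (a list of rows) to CSV text.

    pdfplumber reports empty cells as None, so they are written as empty strings.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(["" if cell is None else cell.strip() for cell in row])
    return buffer.getvalue()


def largest_table_as_csv(pdf_path: str, page_index: int = 0) -> str:
    """Return the largest table on one page as CSV, or "" if no table is found."""
    import pdfplumber  # lazy import: only needed when reading a real PDF

    with pdfplumber.open(pdf_path) as pdf:
        table = pdf.pages[page_index].extract_table()
    return rows_to_csv(table) if table else ""
```

Keeping the CSV step separate from the extraction step makes it easy to check the cleaned rows before they are written to disk or passed to a spreadsheet library.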
Use Cases
- → Extract structured tables from research reports or invoices into Excel or CSV format
- → Automate filling out government or HR forms, including non-fillable scanned documents
- → Merge dozens of individual-page PDFs into a single document for archiving
- → Make scanned document archives searchable by running OCR across pages
- → Generate new PDF reports programmatically from data with multi-page styled layouts
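For the merge use case, page order matters: a plain alphabetical sort puts `page_10.pdf` before `page_2.pdf`. The sketch below sorts file names naturally before merging. It assumes a recent pypdf release in which `PdfWriter.append` is available. The names `natural_key` and `merge_pdfs` are hypothetical and not part of the skill.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'page_2.pdf' before 'page_10.pdf'."""
    name = Path(path).name
    # Splitting on digit runs yields alternating text and number parts;
    # number parts are compared as integers, text parts case-insensitively.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def merge_pdfs(input_paths, output_path):
    """Append every input PDF, in natural file-name order, to one output file."""
    from pypdf import PdfWriter  # lazy import: only needed when merging real files

    writer = PdfWriter()
    for path in sorted(input_paths, key=natural_key):
        writer.append(str(path))
    with open(output_path, "wb") as handle:
        writer.write(handle)


ordered = sorted(["page_10.pdf", "page_2.pdf", "page_1.pdf"], key=natural_key)
```

With a folder of single-page scans, a call such as `merge_pdfs(Path("scans").glob("*.pdf"), "archive.pdf")` would combine them in page order; the folder and output names here are placeholders.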