pdf-verification
Verify PDF page counts and file integrity programmatically using PyPDF2 after generation
| name | pdf-verification |
| description | Verify PDF page counts and file integrity programmatically using PyPDF2 after generation |
After generating PDF files, always verify page counts and file integrity programmatically before declaring task completion. This catches formatting issues, empty pages, and corrupted files early.
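Before any parser is involved, a cheap structural check can flag files that are obviously not PDFs. The sketch below is an illustration rather than part of the original skill: `looks_like_pdf` and its `tail_bytes` parameter are hypothetical names, and it relies only on the convention that a PDF starts with `%PDF-` and carries an `%%EOF` marker near its end. Passing this check does not prove a file is valid, so it complements the page-count verification rather than replacing it.

```python
import os


def looks_like_pdf(filepath, tail_bytes=1024):
    """Cheap structural check (hypothetical helper, not a PyPDF2 API).

    Returns True when the file begins with the '%PDF-' header and the
    '%%EOF' marker appears within the last `tail_bytes` bytes.
    """
    with open(filepath, "rb") as f:
        header = f.read(5)
        # Jump to the end to find the file size, then read only the tail.
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    return header == b"%PDF-" and b"%%EOF" in tail
```

A truncated download or an HTML error page saved with a `.pdf` extension would typically fail this check without any PDF library installed.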
pip install PyPDF2
Use run_shell to execute a Python verification script:
python -c "
from PyPDF2 import PdfReader
import sys

def verify_pdf(filepath, expected_pages=None):
    try:
        reader = PdfReader(filepath)
        actual_pages = len(reader.pages)
    except Exception as e:
        # Unreadable or corrupted files raise while being parsed
        print(f'✗ Error reading {filepath}: {e}')
        return False
    # Check file is not empty
    if actual_pages == 0:
        print(f'✗ Empty PDF: {filepath}')
        return False
    if expected_pages is not None and actual_pages != expected_pages:
        print(f'✗ Page count mismatch: expected {expected_pages}, got {actual_pages}')
        return False
    # Report success only after every check has passed
    print(f'✓ {filepath}: {actual_pages} pages')
    return True

# Verify each PDF
results = []
results.append(verify_pdf('output.pdf', expected_pages=2))
print(f'All checks passed: {all(results)}')
sys.exit(0 if all(results) else 1)
"
For tasks with multiple PDFs, verify each one:
python -c "
from PyPDF2 import PdfReader
import sys

# Map each PDF to its expected page count
pdfs = {
    'listings.pdf': 2,
    'map.pdf': 1,
    'summary.pdf': 1,
}
all_passed = True
for filepath, expected in pdfs.items():
    try:
        reader = PdfReader(filepath)
        actual = len(reader.pages)
        status = '✓' if actual == expected else '✗'
        print(f'{status} {filepath}: {actual}/{expected} pages')
        if actual != expected:
            all_passed = False
    except Exception as e:
        print(f'✗ {filepath}: {e}')
        all_passed = False
print(f'Verification: {\"PASSED\" if all_passed else \"FAILED\"}')
# Exit non-zero so the calling shell can detect a failure
sys.exit(0 if all_passed else 1)
"
Add file size validation to catch empty or near-empty files:
python -c "
import os
import sys
from PyPDF2 import PdfReader

filepath = 'output.pdf'
min_size = 1000  # minimum bytes
file_size = os.path.getsize(filepath)
if file_size < min_size:
    print(f'✗ File too small: {file_size} bytes')
    # Exit non-zero so the failure is visible to the caller
    sys.exit(1)
else:
    reader = PdfReader(filepath)
    print(f'✓ {filepath}: {len(reader.pages)} pages, {file_size} bytes')
"
# Final verification before marking task complete
python -c "
from PyPDF2 import PdfReader
import sys

required = {'listings.pdf': 2, 'map.pdf': 1}
passed = True
for f, pages in required.items():
    try:
        actual = len(PdfReader(f).pages)
        if actual != pages:
            print(f'FAIL: {f} has {actual} pages, expected {pages}')
            passed = False
    except Exception as e:
        print(f'FAIL: Cannot read {f}: {e}')
        passed = False
if passed:
    print('SUCCESS: All PDFs verified')
    sys.exit(0)
else:
    sys.exit(1)
"