OCR — Extract Text from Scans
Powered by Tesseract.js, which runs entirely in your browser. Works on scanned PDFs, photographs, screenshots, and other images. Your files never leave your device.
First run: the Tesseract.js engine and language data (a ~700 KB script plus ~4 MB per language) load from a CDN and are then cached locally in your browser. Subsequent runs work fully offline.
Drop a scanned PDF or image
PDF · JPG · PNG · TIFF · BMP · WebP · GIF
Offline Cache Manager
Download the Tesseract engine and language data to your browser's local cache. Once cached, OCR works fully offline, with no internet connection required.
The Tesseract.js engine (~700 KB script + ~4 MB per language) is downloaded once and stored in your browser. The cache persists across browser restarts until you clear browser data or remove it here.
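As a rough illustration of the sizes quoted above, the sketch below estimates the cache footprint for a given set of language packs. The helper name `estimateCacheKB` is hypothetical, the figures are the approximate ones from the description (actual downloads will vary), and 1 MB is assumed to be 1024 KB.

```typescript
// Approximate sizes taken from the description above; real downloads vary.
const ENGINE_SCRIPT_KB = 700;      // "~700 KB script"
const LANGUAGE_PACK_KB = 4 * 1024; // "~4 MB per language", assuming 1 MB = 1024 KB

// Hypothetical helper: estimated cache size in KB for a list of language codes.
function estimateCacheKB(languages: string[]): number {
  // Assumption: each language pack is stored once, so duplicates are ignored.
  const unique = new Set(languages);
  return ENGINE_SCRIPT_KB + unique.size * LANGUAGE_PACK_KB;
}

console.log(estimateCacheKB(["eng", "deu"])); // 700 + 2 * 4096 = 8892
```

With these assumed figures, the engine plus a single language pack comes to a little under 5 MB.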
Keyboard Shortcuts
Freehtml.app PDF Suite — OCR Tool
Files
Open file picker: O
Reset / clear file: Escape
OCR
Docs
Run OCR: Enter
Copy all extracted text: Ctrl+C
Download as .txt: Ctrl+S
Navigation
Go to Home: H
Go to Editor: E
Go to Batch: B
Appearance
Toggle dark mode: D
Show shortcuts: ?
Close this panel: Escape
Shortcuts are active when no text input is focused.
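The rule that shortcuts only fire when no text input is focused can be sketched as a small resolver function. This is a minimal sketch, not the tool's actual implementation: the action names, the `KeyInfo` shape, and the `panelOpen` flag are all hypothetical, and only the key bindings come from the list above.

```typescript
// Hypothetical action names; the key bindings mirror the shortcut list above.
type Action =
  | "openFile" | "reset" | "runOcr" | "copyText" | "downloadTxt"
  | "goHome" | "goEditor" | "goBatch"
  | "toggleDark" | "showShortcuts" | "closePanel";

// Single-key bindings (no modifier held).
const PLAIN_BINDINGS: Record<string, Action> = {
  o: "openFile",
  escape: "reset",
  enter: "runOcr",
  h: "goHome",
  e: "goEditor",
  b: "goBatch",
  d: "toggleDark",
  "?": "showShortcuts",
};

// Minimal, hypothetical view of the keyboard event fields the resolver needs.
interface KeyInfo {
  key: string;
  ctrlKey?: boolean;
  targetTag?: string;       // tag name of the focused element, e.g. "INPUT"
  targetEditable?: boolean; // true for contenteditable elements
  panelOpen?: boolean;      // true while the shortcuts panel is showing
}

function resolveShortcut(ev: KeyInfo): Action | null {
  // Shortcuts are inactive while a text input is focused.
  const tag = (ev.targetTag ?? "").toUpperCase();
  if (tag === "INPUT" || tag === "TEXTAREA" || tag === "SELECT" || ev.targetEditable) {
    return null;
  }
  const key = ev.key.toLowerCase();
  if (ev.ctrlKey) {
    if (key === "c") return "copyText";
    if (key === "s") return "downloadTxt";
    return null;
  }
  // Escape closes the panel when it is open; otherwise it resets the file.
  if (key === "escape" && ev.panelOpen) return "closePanel";
  return PLAIN_BINDINGS[key] ?? null;
}
```

In a browser, a `keydown` listener could build a `KeyInfo` from the event (`event.key`, `event.ctrlKey`, and the target element's `tagName` and `isContentEditable`) and dispatch on the returned action. Keeping the resolver free of DOM access makes it easy to check in isolation.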