Input
Drag & drop an image or PDF
JPG, PNG, WebP, PDF
or paste a screenshot (Ctrl+V)
Output
Recognized text will appear here
Add a file on the left, click Recognize
Extract text from images and PDF files. Tesseract.js in your browser, 100+ languages, TXT/DOCX export. 100% client-side.
OCR (Optical Character Recognition) is the technology that turns text in images into machine-readable text. Drop in a photo of a receipt, a scanned contract, a screenshot of a spreadsheet, or a PDF, and this tool extracts the text so you can copy it or save it as .txt or .docx. Everything runs in your browser via Tesseract.js compiled to WebAssembly, so your documents never leave your device.
Yes, your files stay private. The entire OCR process runs in your browser: files are not sent to any server and never leave your device. This matters for scans of contracts, ID documents, receipts, or medical records. An online OCR service that requires an upload cannot give you that guarantee.
English and Polish are built in. Other languages (German, Ukrainian, French, Spanish, and 100+ more available via Tesseract) download on demand. Each language model is about 10-15 MB and is cached in the browser, so subsequent uses skip the download and start instantly.
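The on-demand download and caching described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the tool's actual code: the `LanguageModelCache` class and the `ModelLoader` type are hypothetical names, and the real tool presumably relies on persistent browser storage rather than a `Map`.

```typescript
// Hypothetical loader: fetches the model bytes for one language code (e.g. "deu").
type ModelLoader = (lang: string) => Promise<Uint8Array>;

// Assumption: a simple in-memory cache keyed by language code.
class LanguageModelCache {
  private readonly models = new Map<string, Promise<Uint8Array>>();
  private readonly load: ModelLoader;

  constructor(load: ModelLoader) {
    this.load = load;
  }

  // Returns the cached model, or starts a download on first request.
  get(lang: string): Promise<Uint8Array> {
    const cached = this.models.get(lang);
    if (cached !== undefined) {
      return cached;
    }
    const pending = this.load(lang);
    // Store the promise itself so concurrent requests share one download.
    this.models.set(lang, pending);
    // Forget failed downloads so a later call can retry.
    pending.catch(() => {
      this.models.delete(lang);
    });
    return pending;
  }

  // True if the language was already requested and has not failed.
  has(lang: string): boolean {
    return this.models.has(lang);
  }
}
```

Storing the promise rather than the resolved bytes means that two requests for the same language made in quick succession trigger only one download.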
Polish recognition is excellent. The built-in Polish model (pol.traineddata) recognizes all Polish characters. For best quality, use Accurate mode and make sure the image is sharp: high-quality scans typically reach 95%+ confidence, while phone photos usually land at 85-90%.
Yes. You can upload a multi-page PDF — each page is rendered and processed sequentially. The output is combined text from all pages with page separators.
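The sequential page handling above might look something like this sketch. The `recognizePage` callback is a hypothetical stand-in for rendering and recognizing one page, and the `--- Page N ---` separator format is an assumption; the tool's actual separator may differ.

```typescript
// Joins per-page text into one string with a separator line before each page.
// Assumption: the "--- Page N ---" format is illustrative only.
function combinePages(pages: string[]): string {
  return pages
    .map((text, index) => `--- Page ${index + 1} ---\n${text.trim()}`)
    .join("\n\n");
}

// Processes pages one at a time, in order, then combines the results.
// `recognizePage` is a hypothetical callback that renders page `n` (1-based)
// and returns its recognized text.
async function recognizePdf(
  pageCount: number,
  recognizePage: (n: number) => Promise<string>,
): Promise<string> {
  const texts: string[] = [];
  for (let n = 1; n <= pageCount; n++) {
    // Awaiting inside the loop keeps processing strictly sequential.
    texts.push(await recognizePage(n));
  }
  return combinePages(texts);
}
```

Processing one page at a time keeps memory use bounded to a single rendered page, at the cost of a longer total run on large documents.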
The first run downloads the OCR engine (~4 MB) and language model (~10-15 MB per language). Files are cached by the browser — subsequent uses with the same language are instant. You can use it offline after the first download.
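As a rough worked example of the figures above, the first-run download can be estimated from the number of language models that need fetching. The function name and constants below are illustrative, taken only from the approximate sizes quoted here (about 4 MB for the engine, 10-15 MB per language).

```typescript
// Approximate sizes quoted in the text above (assumed, not measured).
const ENGINE_MB = 4;
const MODEL_MB = { min: 10, max: 15 };

// Estimates the first-run download range in MB for a number of language models.
function firstRunDownloadMb(languageCount: number): { min: number; max: number } {
  if (!Number.isInteger(languageCount) || languageCount < 0) {
    throw new RangeError("languageCount must be a non-negative integer");
  }
  return {
    min: ENGINE_MB + languageCount * MODEL_MB.min,
    max: ENGINE_MB + languageCount * MODEL_MB.max,
  };
}
```

With one language this gives roughly 14-19 MB, and with two languages roughly 24-34 MB.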