Does OCR work on handwriting?

Mostly no. Tesseract was trained on typeset, printed text and its accuracy on handwriting is significantly lower — typically below 50% on cursive or stylised handwriting. If your document is handwritten, you may get partial results on neatly printed block letters, but cursive writing, joined-up text, or personal handwriting styles will produce unreliable output. Dedicated handwriting recognition models (such as those from Google or Amazon) exist but are not included in this tool.

Why is my Urdu or Arabic text showing backwards or scrambled?

Urdu and Arabic are right-to-left (RTL) languages. Most plain text editors, including the output box in this tool, display text left-to-right by default, which can make RTL text appear reversed. If you copy the text into a word processor like Microsoft Word or Google Docs and set the paragraph direction to RTL, it should display correctly. Make sure you have selected the correct language (Urdu/urd or Arabic/ara) before extraction — using the English model on Arabic text is the most common cause of garbled output.
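If you post-process the extracted text yourself, a small check like the one below can tell you whether a string is predominantly right-to-left before you decide how to display it. This is only a sketch: the function names `is_mostly_rtl` and `wrap_rtl` are hypothetical and are not part of this tool. It relies on the Unicode bidirectional classes exposed by Python's standard `unicodedata` module.

```python
import unicodedata


def is_mostly_rtl(text: str) -> bool:
    """Return True when right-to-left letters outnumber left-to-right ones."""
    rtl = ltr = 0
    for ch in text:
        direction = unicodedata.bidirectional(ch)
        if direction in ("R", "AL"):  # Hebrew-type and Arabic-letter classes
            rtl += 1
        elif direction == "L":  # Latin and other left-to-right letters
            ltr += 1
    return rtl > ltr


def wrap_rtl(text: str) -> str:
    """Wrap text in RIGHT-TO-LEFT ISOLATE (U+2067) and POP DIRECTIONAL ISOLATE (U+2069)."""
    return "\u2067" + text + "\u2069"
```

Wrapping a string with these isolate characters is one way to hint the direction to editors that honour Unicode bidi controls; setting the paragraph direction in the word processor, as described above, remains the more dependable route.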

What image formats does this OCR tool support?

The tool accepts JPG (JPEG), PNG, WebP, BMP, and TIFF formats, up to 20 MB per image. For scanned documents, TIFF and PNG are preferred because they are lossless — no compression artefacts that could reduce OCR accuracy. JPG is fine for photographs but uses lossy compression that can blur fine strokes in small text.
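The format and size limits above can be expressed as a simple pre-upload check. The sketch below is illustrative only: `check_upload` is a hypothetical helper, the extension list (including `.tif`) is an assumption, and it treats "20 MB" as 20 × 1024 × 1024 bytes, which may differ from how the tool counts.

```python
from pathlib import Path

# Assumed extension spellings for the formats listed in the text.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" interpreted as 20 MiB


def check_upload(filename: str, size_bytes: int) -> str:
    """Return "ok" or a short reason why the file would be rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return "unsupported format"
    if size_bytes > MAX_BYTES:
        return "file too large"
    return "ok"
```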

What resolution do I need for good OCR accuracy?

The minimum recommended resolution for reliable OCR is 300 DPI (dots per inch) for standard-sized text. For very small fonts (below 10pt) or complex scripts like Chinese, Arabic, or Devanagari, aim for 400–600 DPI. If you are photographing a document with a smartphone, make sure the camera is close enough that text fills the frame — blurry or distant photos at effective 72 DPI equivalent will produce poor results.
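For a phone photo, the effective DPI can be estimated by dividing the number of pixels that span the page by the physical width of the page in inches. The sketch below assumes the page fills the full width of the frame; `effective_dpi` is a hypothetical helper name.

```python
A4_WIDTH_INCHES = 8.27  # 210 mm


def effective_dpi(pixels_across: int, inches_across: float) -> float:
    """Pixels available per inch of the physical page."""
    return pixels_across / inches_across


# A 3024-pixel-wide photo in which an A4 page fills the frame:
print(round(effective_dpi(3024, A4_WIDTH_INCHES)))  # about 366
# A 600-pixel-wide thumbnail of the same page:
print(round(effective_dpi(600, A4_WIDTH_INCHES)))  # about 73
```

If the page only fills half the frame width, the effective DPI halves as well, which is why moving the camera closer matters.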

Can I extract text from a PDF?

For PDFs that already contain a text layer (most modern digital PDFs), use the PDF to Text tool (/tools/pdf-to-text), which extracts text instantly without OCR. For scanned PDFs (where each page is an image), use the PDF to Image tool (/tools/pdf-to-image) to export each page as a PNG, then run each PNG through this OCR tool. Batch OCR across multiple PDF pages is not yet supported in a single click, but is on the roadmap.

How accurate is the text extraction?

Accuracy depends on language, image quality, and font type. For English text on a clean scan at 300 DPI or higher, you can expect 95–99% character accuracy, meaning roughly 1 to 5 errors per 100 characters. For other Latin-script European languages the accuracy is similarly high. For complex scripts (Arabic, Urdu, Devanagari, CJK) on good scans, accuracy is typically 85–95%. Low-resolution images, poor contrast, skewed scans, or decorative fonts can reduce accuracy to 60–75% in any language. The confidence score shown after extraction gives you a per-extraction estimate.
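If you have a known-correct transcription to compare against, character accuracy can be measured with the edit distance between the reference and the OCR output. The sketch below uses the standard Levenshtein distance; the function names are illustrative, and this is one common way to compute the figure rather than the method this tool uses internally.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Percentage of reference characters recovered correctly (floored at 0)."""
    if not reference:
        return 100.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 100.0 * (1 - errors / len(reference)))
```

For example, `character_accuracy("abcd", "abxd")` gives 75.0, because one of the four reference characters was misread.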

Can I extract text from multiple images at once?

Currently the tool processes one image at a time. Batch OCR across multiple images in a single session is planned for a future release. For now, upload each image separately and use the Download .txt option to save each result; you can then combine the text files manually.
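Combining the downloaded files does not have to be done by hand. The sketch below assumes you saved each result as a UTF-8 `.txt` file in one folder with names that sort in page order (for example `page-01.txt`, `page-02.txt`); the function name and the `combined.txt` output name are hypothetical.

```python
from pathlib import Path


def combine_text_files(folder: str, output_name: str = "combined.txt") -> Path:
    """Join every .txt file in the folder, in filename order, into one file."""
    folder_path = Path(folder)
    parts = []
    for path in sorted(folder_path.glob("*.txt")):
        if path.name == output_name:
            continue  # do not fold a previous combined file back into itself
        parts.append(path.read_text(encoding="utf-8").strip())
    output_path = folder_path / output_name
    # Separate pages with a blank line and end the file with a newline.
    output_path.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return output_path
```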

What does the confidence score mean?

The confidence score is a percentage (0–100%) that Tesseract assigns to the entire recognition result. It reflects the average probability of each recognised character across the full output. A score above 80% generally indicates clean, reliable output. A score of 60–80% suggests some characters may have been misread, and the output is worth a quick review. A score below 60% indicates that the image quality or language selection may need adjustment: try re-scanning at a higher resolution, or confirm that the correct language is selected.
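The thresholds above map onto a small decision function. This is a sketch of the guidance in this answer, not code taken from the tool; `interpret_confidence` and its return labels are hypothetical.

```python
def interpret_confidence(score: float) -> str:
    """Translate a 0-100 confidence score into the suggested next step."""
    if not 0 <= score <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if score > 80:
        return "reliable"  # clean output, spot-check only
    if score >= 60:
        return "review"    # some characters may be misread
    return "adjust"        # re-scan or check the selected language
```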

Free Image to Text Converter (OCR) — Pixab AI

Extract text from images, photos and screenshots in 50+ languages including Urdu and Arabic. 100% private — runs in your browser.

How it works

1. Drop or select an image (JPG, PNG, WebP, BMP, or TIFF, up to 20 MB).
2. Choose the language of the text in your image from the dropdown.
3. Enable multi-language mode if the image contains two or three scripts.
4. Click "Extract Text"; the OCR engine loads and runs locally in your browser.
5. Review the extracted text in the editable output box and correct any errors.
6. Copy to clipboard or download as a .txt or .docx file.

How to Extract Text from an Image

Extracting text from an image with Pixab AI is a straightforward, fully private process that takes under a minute. No account is required, nothing is uploaded to a server, and the resulting text is immediately editable and downloadable. Here is how to get the best results from every scan.

Step 1 — Upload your image. Click the upload area or drag and drop your file directly onto the tool. Supported formats include JPG, PNG, WebP, BMP, and TIFF. The maximum file size is 20 MB. If you are working from a screenshot, you can paste it into a paint app first and save as PNG. For scanned documents, use the highest resolution scan you can produce — anything at 300 DPI or above will give excellent results.

Step 2 — Choose the language. The tool defaults to English. If your image contains text in another language — such as Urdu, Arabic, Hindi, or Chinese — select the correct language from the dropdown before running extraction. Selecting the correct language is the single most important factor after image quality: using the wrong language model causes garbled output even on a perfect image. The searchable dropdown lets you find any of the 50+ supported languages by typing a partial name.

Step 3 — Enable multi-language mode if needed. If your image contains text in more than one language — for example, an English caption under an Arabic headline — toggle on "Multi-language mode" and select up to three language codes. Tesseract will attempt to recognise all selected scripts simultaneously. Keep in mind that combining scripts increases processing time and can slightly reduce accuracy for each individual language.
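Tesseract itself accepts combined languages as codes joined with a plus sign (for example `eng+ara` on its command line). The sketch below builds such a string and enforces the three-language limit described in this step. The helper name is hypothetical, and whether this tool builds the string the same way is an assumption.

```python
def build_language_string(codes: list[str], max_languages: int = 3) -> str:
    """Join Tesseract language codes with '+', dropping blanks and duplicates."""
    cleaned = (code.strip().lower() for code in codes if code.strip())
    unique = list(dict.fromkeys(cleaned))  # keeps first-seen order
    if not unique:
        raise ValueError("select at least one language")
    if len(unique) > max_languages:
        raise ValueError(f"select at most {max_languages} languages")
    return "+".join(unique)
```

For an English caption under an Arabic headline, `build_language_string(["ara", "eng"])` returns `"ara+eng"`.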

Step 4 — Click Extract Text. The button triggers a lazy load of the Tesseract OCR engine and the relevant language model. On the first run, a progress bar shows download and initialisation steps: "Initializing…" → "Loading language model…" → "Recognizing text…". The language data (around 10 MB per language) is cached in your browser after the first download, so subsequent runs are significantly faster.

Step 5 — Review and edit the output. The extracted text appears in an editable text box. You can correct any recognition errors directly before downloading. Below the text, a stats bar shows the confidence score, word count, and processing time. A confidence score above 80% generally indicates clean, well-recognised output; below 60% suggests the image quality or language mismatch may need attention.

Step 6 — Download or copy. Use the Copy button to place the text on your clipboard, or download as a plain .txt file for use in any text editor. For richer formatting, choose "Download .docx" — the tool wraps each paragraph into a Microsoft Word-compatible document you can open immediately in Word, Google Docs, or LibreOffice. Once you have your text, try the Word Counter to measure it, or Find & Replace to clean up OCR errors in bulk.

Languages Supported

Pixab AI's OCR tool is powered by Tesseract, an open-source engine that was originally developed at Hewlett-Packard and later maintained with sponsorship from Google. Tesseract itself provides over 100 language models; this tool offers 50+ of them, organised by script family.

Latin-script languages include English, Spanish, French, German, Italian, Portuguese, Dutch, Polish, Romanian, Swedish, Norwegian, Danish, Finnish, Croatian, Czech, Slovak, Slovenian, Catalan, Afrikaans, Indonesian, Malay, Vietnamese, and Turkish. Latin-script OCR typically achieves the highest accuracy because the character set is smaller and training data is abundant.

Arabic-script languages: Urdu and Arabic OCR. Urdu and Arabic are right-to-left languages that share the Arabic script; printed Urdu is usually set in the Nastaliq style, while Arabic is usually set in Naskh. Tesseract includes dedicated trained models for both. For Urdu text common in Pakistan (newspapers, government certificates, textbook covers, and official letters), select "Urdu (urd)". For Modern Standard Arabic as well as Gulf, Egyptian, and Levantine printed text, select "Arabic (ara)". Note that handwritten Urdu and Arabic remain challenging for any OCR engine; printed text in a standard font produces much better results.

Indian-script languages — Hindi, Punjabi, Bengali OCR. Hindi uses the Devanagari script and is well supported with the "hin" language model. Punjabi in Gurmukhi script ("pan"), Bengali ("ben"), Gujarati ("guj"), Kannada ("kan"), Malayalam ("mal"), Tamil ("tam"), Telugu ("tel"), Marathi ("mar"), Nepali ("nep"), and Sinhala ("sin") are all available. These scripts have more complex ligature systems than Latin; accuracy improves significantly at higher image resolutions and with clear, printed fonts.

CJK languages — Chinese, Japanese, Korean. Chinese Simplified ("chi_sim") and Traditional ("chi_tra") are separate models because the character sets differ. Japanese ("jpn") supports kanji, hiragana, and katakana. Korean ("kor") uses the Hangul alphabet. CJK languages have very large character sets — thousands of distinct glyphs — which makes the language model files larger (~20 MB) and accuracy more sensitive to image quality.

Cyrillic-script languages include Russian, Ukrainian, Bulgarian, and Serbian. Greek, Hebrew, and Thai each have their own dedicated models as well. For languages not in the primary selector, scroll down to "Other languages" in the dropdown to see the full list.

Accuracy varies by language and image quality. Latin-script languages on clean scans typically achieve 95%+ confidence. Complex scripts on low-resolution images may yield 60–80% confidence. The confidence score shown after each extraction helps you judge whether the result is usable or whether you should re-scan at higher resolution.

What is OCR and How Does It Work?

OCR stands for Optical Character Recognition — the technology that converts images of text into machine-readable characters. The concept dates back to the 1950s when researchers first tried to automate the reading of printed text using photoelectric sensors. Early commercial systems in the 1970s and 1980s used template matching: the system held a database of what each letter looked like, and found the closest match pixel-by-pixel. These systems worked only with specific fonts and failed on any deviation.

Modern OCR is driven by LSTM (Long Short-Term Memory) neural networks, a type of recurrent neural network that reads sequences of pixels and predicts character sequences based on learned context. Tesseract 4, released in 2018 and the engine behind this tool, added an LSTM-based recognizer as its default alongside the older pattern-matching engine. The network was trained on millions of text line images across the supported languages, learning both the shape of individual characters and the statistical likelihood of character sequences within each language.

The recognition pipeline has several stages. First, the image is binarized — converted to black and white — to separate text pixels from background. Then a page layout analysis step identifies text blocks, columns, and reading order. Individual text lines are extracted and fed into the LSTM recognizer, which outputs a probability distribution over all possible characters at each position. The most likely character sequence is selected, and the confidence score reflects the average probability across all recognised characters.
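Two of these stages are easy to illustrate in miniature. The sketch below is a toy, not Tesseract's implementation: it binarizes with a single fixed threshold (real engines choose thresholds adaptively) and decodes by taking the most likely character at each position, then averages those probabilities into a confidence percentage. All names and the sample probabilities are made up for illustration.

```python
def binarize(gray_rows: list[list[int]], threshold: int = 128) -> list[list[int]]:
    """Map 0-255 grayscale pixels to 1 (dark, likely text) or 0 (light background)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_rows]


def greedy_decode(step_probs: list[dict[str, float]]) -> tuple[str, float]:
    """Pick the most probable character per position; return text and mean probability (%)."""
    chars, probs = [], []
    for distribution in step_probs:
        char, prob = max(distribution.items(), key=lambda item: item[1])
        chars.append(char)
        probs.append(prob)
    confidence = 100.0 * sum(probs) / len(probs) if probs else 0.0
    return "".join(chars), confidence


# Hypothetical recognizer output for a three-character word.
steps = [{"c": 0.9, "e": 0.1}, {"a": 0.8, "o": 0.2}, {"t": 0.7, "l": 0.3}]
text, confidence = greedy_decode(steps)
print(text, round(confidence, 1))  # cat 80.0
```

The example shows why one uncertain character pulls the overall score down even when the rest of the word is read confidently.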

Factors that affect accuracy include resolution (more pixels per character = better), contrast between text and background (dark text on white is ideal), font type (serif and sans-serif printed fonts outperform decorative or handwritten ones), skew (text tilted more than a few degrees confuses layout analysis), and noise (scanning artefacts, coffee stains, and compression artefacts all reduce accuracy).

OCR Use Cases

Extracting text from scanned books and old documents. Libraries, researchers, and archivists use OCR to digitise printed books that have never existed in electronic form. A scanned page of a 1960s textbook becomes fully searchable and copy-pasteable after OCR. This is particularly valuable for out-of-print academic texts, regional-language literature, and historical legal documents.

Copying text from screenshots. Government portals, banking apps, and PDF viewers sometimes display text as images, making it impossible to select and copy. OCR solves this: take a screenshot, run it through the tool, and you have editable text within seconds. This is especially useful for copying reference numbers from government certificates, CNIC data, or utility bill details.

Digitizing printed receipts, bills, and letters. Accountants and small business owners photograph paper receipts for expense tracking. Instead of manually typing every line item, OCR extracts the text automatically. The extracted text can be pasted directly into a spreadsheet or accounting tool.

Reading text from photos of signs or menus. Travellers photograph restaurant menus or street signs in foreign scripts and use OCR to extract the text before translating it. Selecting the correct language before extracting greatly improves the accuracy of the extracted characters.

Accessibility — making image-based content searchable. Many PDFs are created by scanning paper and saving as an image PDF — there is no underlying text layer, so screen readers cannot read them. If you have such a PDF, use PDF to Image to extract each page as an image, then run OCR on each page. Alternatively, for text-based PDFs, the PDF to Text tool extracts text directly without OCR.

Research — extracting quotes from photos of books. Students and researchers photograph pages of library books they cannot borrow. OCR converts these photos into searchable text, enabling full-text search across dozens of photographed pages and making citation extraction far faster.

Students in Pakistan and India: textbook photos. A very common use case across South Asia is photographing textbook pages and past-paper questions on a smartphone. OCR converts these into editable text that can be pasted into WhatsApp study groups, translated, or reformatted. With Urdu and Hindi both supported, this tool covers the most common textbook languages used in Pakistani and Indian schools.

Image Quality Tips for Best OCR Results

Scan at 300 DPI or higher. DPI (dots per inch) determines how many pixels represent each character. Below 200 DPI, small characters become blurry blobs that the OCR engine cannot distinguish reliably. 300 DPI is the usual minimum for document OCR; use 400–600 DPI for very small fonts or complex scripts like Chinese and Devanagari.
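The link between DPI and pixels per character follows from the definition of a typographic point (1/72 of an inch). The sketch below computes the nominal font height in pixels; `font_height_px` is a hypothetical helper, and the visible lowercase letters are smaller than this nominal height, so treat the result as an upper bound.

```python
POINTS_PER_INCH = 72


def font_height_px(font_size_pt: float, dpi: float) -> float:
    """Nominal height of a font, in pixels, when scanned at the given DPI."""
    return font_size_pt * dpi / POINTS_PER_INCH


print(font_height_px(12, 300))           # 50.0
print(font_height_px(10, 72))            # 10.0
print(round(font_height_px(8, 150), 1))  # 16.7
```

A 10pt font scanned at 72 DPI has only about ten pixels of height to describe each glyph, which is why low-resolution captures read so poorly.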

Ensure good lighting with no shadows. When photographing a document with a phone camera, place it on a flat surface under even lighting. Shadows cast by your hand or a nearby object create dark streaks that the binarization step misinterprets as text. A well-lit A4 page photographed flat produces results comparable to a flatbed scanner.

Keep the document straight. Tesseract tolerates slight skew, up to about 5 degrees, but severe rotation causes the layout analysis to fail. If you have a significantly tilted scan, rotate the image until the text lines are close to horizontal before uploading. Most phone photo apps include a free-rotation tool for this.

Maximise contrast between text and background. Black text on white paper is ideal. Coloured backgrounds, watermarks, and patterned paper all reduce contrast and accuracy. If your image has a coloured background, consider using a photo editor to increase contrast before OCR.
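One simple form of contrast adjustment is a linear stretch, which rescales the darkest pixel to 0 and the lightest to 255. The sketch below works on a flat list of grayscale values; it is an illustration of the idea and not necessarily what any particular photo editor does. The function name is hypothetical.

```python
def stretch_contrast(pixels: list[int]) -> list[int]:
    """Linearly rescale grayscale values so they span the full 0-255 range."""
    if not pixels:
        return []
    low, high = min(pixels), max(pixels)
    if high == low:
        return list(pixels)  # flat image: nothing to stretch
    return [round((pixel - low) * 255 / (high - low)) for pixel in pixels]


# Grey text (50) on a dull background (200) becomes black on white.
print(stretch_contrast([50, 100, 200]))  # [0, 85, 255]
```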

Avoid handwriting when possible. Tesseract was trained primarily on printed, typeset text. Handwriting — even clear, consistent handwriting — is a fundamentally different recognition problem. Results on handwritten content will be significantly less accurate. Dedicated handwriting OCR models exist but are not included in Tesseract.

Privacy and Security

Every step of the OCR process runs entirely inside your browser. When you click Extract Text, the tool lazy-loads the Tesseract engine and language model files from a CDN and stores them in your browser's cache. Your image is read into browser memory, processed locally by the Tesseract LSTM engine, and the resulting text is returned to the page — no part of your image ever leaves your device.

This is especially important when working with sensitive documents such as passports, CNIC cards, medical reports, financial statements, or confidential correspondence. Unlike cloud-based OCR services, there is no server log of your image, no temporary storage on a third-party system, and no risk of your document being retained or processed for training data.

OCR Comparison

vs Google Lens. Google Lens is convenient for quick one-tap translations on a phone, but it sends your image to Google's servers, requires a Google account to use on desktop, and gives you no editable text output or confidence score. Pixab AI runs offline in the browser after the initial model download and produces a full editable document.

vs Adobe Acrobat OCR. Adobe's OCR is high quality but locked behind an Acrobat subscription costing around $15–25 per month. Pixab AI is completely free with no subscription tier.

vs other online OCR sites that upload your image. Many free OCR websites — i2OCR, onlineocr.net, and others — work by uploading your image to their server. This creates a privacy risk: your document is transmitted over the network and processed on hardware you have no visibility into. Pixab AI processes everything client-side.

Our advantage: free, private, no account. Unlimited extractions, 50+ languages, .txt and .docx download, confidence scoring — all without creating an account or providing an email address.
