What does OCR Text Recognition not protect?

It extracts text only; it does not create a searchable PDF layer. Output quality varies with scan resolution, language selection, and font clarity. When the reported recognition confidence falls below 70%, expect word-level errors.
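As a rough illustration of the 70% rule, the sketch below flags words whose confidence falls under that threshold so they can be reviewed by hand. The `RecognizedWord` shape and the `flagLowConfidence` helper are hypothetical names introduced for this example; they assume each recognized word carries a confidence score on a 0-100 scale, and they do not call the OCR engine itself.

```typescript
// Hypothetical minimal shape for one recognized word.
// Assumption: confidence is reported on a 0-100 scale.
interface RecognizedWord {
  text: string;
  confidence: number;
}

// Threshold taken from the guidance above: below 70, expect word-level errors.
const CONFIDENCE_THRESHOLD = 70;

// Return the words that should be reviewed manually.
function flagLowConfidence(
  words: RecognizedWord[],
  threshold: number = CONFIDENCE_THRESHOLD
): RecognizedWord[] {
  return words.filter((w) => w.confidence < threshold);
}

// Illustrative data only, not real OCR output.
const sampleWords: RecognizedWord[] = [
  { text: "Invoice", confidence: 96 },
  { text: "t0tal", confidence: 58 },
  { text: "due", confidence: 70 },
];

const suspectWords = flagLowConfidence(sampleWords);
console.log(suspectWords.map((w) => w.text)); // only "t0tal" is flagged
```

A word at exactly 70 is not flagged here, because the guidance refers to confidence below 70%.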

What this does not protect

Tesseract.js runs in a Web Worker, and each page consumes roughly 50-100 MB of RAM during processing. Documents over 50 pages may cause memory pressure on devices with less than 4 GB of free memory.
Handwritten text is poorly supported. Tesseract is designed for printed text. Expect less than 30% accuracy on handwriting.
Multi-column layouts are partially supported. Tesseract reads left-to-right by default and may interleave columns on complex layouts.
Tables are not preserved structurally. Cell contents are extracted as text, but row/column relationships are lost.
OCR Text Recognition cannot fix compromised devices, accounts, or unsafe sharing channels.
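The memory figures above can be turned into a quick estimate. The sketch below is only a back-of-the-envelope calculation: it assumes the memory for every page in a batch is held at the same time, which may overstate real usage if pages are released as they finish. The helper names (`estimatePeakMemoryMb`, `maxPagesPerBatch`, `chunkPages`) are hypothetical and not part of Tesseract.js.

```typescript
// Per-page RAM range quoted above (in MB).
const MB_PER_PAGE_LOW = 50;
const MB_PER_PAGE_HIGH = 100;

// Estimated memory range if `pages` pages are held in memory together.
function estimatePeakMemoryMb(pages: number): { low: number; high: number } {
  return { low: pages * MB_PER_PAGE_LOW, high: pages * MB_PER_PAGE_HIGH };
}

// Conservative batch size: plan for the high end of the per-page range,
// and always allow at least one page.
function maxPagesPerBatch(freeMemoryMb: number): number {
  return Math.max(1, Math.floor(freeMemoryMb / MB_PER_PAGE_HIGH));
}

// Split zero-based page indices into batches of at most `batchSize` pages.
function chunkPages(pageCount: number, batchSize: number): number[][] {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  const batches: number[][] = [];
  for (let start = 0; start < pageCount; start += batchSize) {
    const end = Math.min(start + batchSize, pageCount);
    const batch: number[] = [];
    for (let page = start; page < end; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}

// A 50-page document: 2,500 MB at the low end, 5,000 MB at the high end.
const fiftyPages = estimatePeakMemoryMb(50);
console.log(fiftyPages);

// With 4 GB (4,096 MB) free, plan for at most 40 pages per batch.
const batchSize = maxPagesPerBatch(4096);
const batches = chunkPages(120, batchSize);
console.log(batchSize, batches.length);
```

Under these assumptions a 50-page document could need up to 5,000 MB, which is more than 4 GB, so processing in smaller batches and freeing results between batches is one way to reduce memory pressure.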