- 02-10-2011 #1
- Join Date
- Aug 2010
How to scan to text file
Neither my Simple Scan nor the Xsane image scan facility appears to be able to scan a typed document to editable text. Is there software available to enable this facility on Ubuntu 10.10 using my HP Officejet 6313?
- 02-10-2011 #2
Look for optical character recognition (OCR) software. I'm not sure what all is out there, but I know that for GNOME there is OCRFeeder. I don't know if you can scan directly to text, but you should be able to convert your scanned doc to text with it.
- 02-10-2011 #3
Look for Tesseract; it seems to be quite good. It needs a TIFF image to convert to text.
Charles
ASUS EEE Box B202, Atom 270 1,6GHz, 1 GB, HDD 80GB, XP-SP3 / PinguyOS
Asus EEE PC 901 with Bodhi-Linux
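For anyone who wants to script the Tesseract step, here is a minimal sketch, assuming the `tesseract` command-line tool is installed and that the scan has already been saved as an image. The function names and the file name `page.tif` are made up for illustration, and the default language `eng` is an assumption.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_base, lang="eng"):
    """Build the argument list: tesseract <image> <output base> -l <lang>."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_with_tesseract(image, output_base, lang="eng"):
    """Run Tesseract on a scanned image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(tesseract_command(image, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name itself.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling `ocr_with_tesseract("page.tif", "page")` should leave the recognized text in `page.txt` and return it as a string. How good the text is depends on the scan quality.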
- 02-10-2011 #4
Many thanks. I managed to get limited text recognition via gscan2pdf, but OCRFeeder works much better; the only catch is that I have to scan to PDF first. It's a bit of a schlep compared to the MS Windows/HP version, but it works.
- 02-11-2011 #5
Convert the file to PDF, then try selecting the text area and copying it to the clipboard with Gwenview (PDF viewer for KDE)...
- 02-11-2011 #6
Xsane can scan directly (well, indirectly, but one step) to a text file. Install OCR software such as gocr, then in xsane set the preferences to use gocr as the OCR/text client. It was set to that by default on mine. Then change the file type to be saved as text, and scan. You should get a text file wherever you pointed the output. Xsane scans to a temporary image, then pipes it through gocr, which does the OCR work and saves the text file.
You may need to tweak gocr to get usable text output. Other OCR software may work too; you just need to tell xsane to use it in the preferences.
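The same gocr step can be run by hand on a saved scan, which makes it easier to experiment with the tweaking. This is only a rough sketch, assuming `gocr` is installed and the scan was saved in a format gocr reads directly (such as PNM). The function names and file names are hypothetical. The `-l` grey-level threshold option is used here on the assumption that your gocr version supports it; check `man gocr` before relying on it.

```python
import shutil
import subprocess
from pathlib import Path


def gocr_command(image, text_out, grey_level=None):
    """Build the argument list: gocr -i <image> -o <text file> [-l <level>]."""
    cmd = ["gocr", "-i", str(image), "-o", str(text_out)]
    if grey_level is not None:
        # Assumed tweak: grey-level threshold, see `man gocr`.
        cmd += ["-l", str(grey_level)]
    return cmd


def ocr_with_gocr(image, text_out, grey_level=None):
    """Run gocr on a saved scan and return the text it wrote."""
    if shutil.which("gocr") is None:
        raise RuntimeError("gocr was not found on PATH")
    subprocess.run(gocr_command(image, text_out, grey_level), check=True)
    return Path(text_out).read_text(encoding="utf-8", errors="replace")
```

For example, `ocr_with_gocr("scan.pnm", "scan.txt", grey_level=140)` would try one threshold value. Comparing a few values on the same scan is a quick way to see whether tweaking helps.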
- 02-11-2011 #7
The only way I get acceptable results with gocr is by scanning as a .pdf or .tiff to the viewer, then running the OCR program from the viewer. For whatever reason, saving directly as text gives me garbage. I haven't really played with it that much; this was just a very quick test for this thread.