  1. #1
    Just Joined!
    Join Date
    Aug 2010
    Posts
    76

    How to scan to text file


    Neither my Simple Scan nor the Xsane image scan facility appears to be able to scan a typed document to editable text. Is there software available to enable this facility on Ubuntu 10.10 using my HP Officejet 6313?

  2. #2
    Linux Guru reed9's Avatar
    Join Date
    Feb 2009
    Location
    Boston, MA
    Posts
    4,651
    Look for optical character recognition (OCR) software. I'm not sure what all is out there, but I know there is OCRFeeder for GNOME. I don't know whether it can scan directly to text, but you should be able to convert your scanned document to text with it.

  3. #3
    Linux Newbie Charles4809's Avatar
    Join Date
    Nov 2008
    Location
    Utrecht, NL
    Posts
    138
    Look for Tesseract; it seems to be quite good. It needs a TIFF image to convert to text.
    Charles
    ASUS EEE Box B202, Atom 270 1,6GHz, 1 GB, HDD 80GB, XP-SP3 / PinguyOS
    Asus EEE PC 901 with Bodhi-Linux
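
For anyone who wants to script the Tesseract suggestion above, here is a minimal sketch in Python. It assumes the `tesseract` command-line tool is installed and on the PATH, and it uses the basic `tesseract IMAGE OUTPUT_BASE -l LANG` invocation. The file name `page.tif` and the helper names `tesseract_command` and `ocr_image` are made up for illustration; they are not part of Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_base, lang="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUT_BASE -l LANG."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_image(image, lang="eng"):
    """Run Tesseract on one scanned image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    image = Path(image)
    # Tesseract appends ".txt" to the output base on its own.
    output_base = image.with_suffix("")
    subprocess.run(tesseract_command(image, output_base, lang), check=True)
    text_file = Path(str(output_base) + ".txt")
    return text_file.read_text(encoding="utf-8")


if __name__ == "__main__":
    # "page.tif" is a hypothetical scan saved from Simple Scan or Xsane.
    print(ocr_image("page.tif"))
```

Recognition quality depends a lot on the scan itself, so treat this as a starting point rather than a finished tool.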

  4. #4
    Just Joined!
    Join Date
    Aug 2010
    Posts
    76
    Many thanks. I managed to get limited text recognition via gscan2pdf, but OCRFeeder works much better, although I first have to scan to PDF. It's a bit of a schlep compared to the MS Windows/HP version, but it works.

  5. #5
    Just Joined! supermanisdeady's Avatar
    Join Date
    Dec 2010
    Posts
    33
    Convert the file to PDF, then try selecting the text area and copying it to the clipboard with gwenview (PDF viewer for KDE)...

  6. #6
    Linux User sgosnell's Avatar
    Join Date
    Oct 2010
    Location
    Baja Oklahoma
    Posts
    469
    Xsane can scan directly (well, indirectly, but in one step) to a text file. Install OCR software such as gocr, then in xsane set the preferences to use gocr as the OCR/text client. It was set to that by default on mine. Then change the file type to be saved as text, and scan. You should get a text file wherever you pointed the output. Xsane scans to a temporary image, then pipes it through gocr, which does the OCR work and saves the text file.

    You may need to tweak gocr to get usable text output. Other OCR software may work too; you just need to tell xsane to use it in the preferences.
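
The scan-then-OCR flow described above can also be approximated outside xsane. The following is a rough Python sketch, not what xsane does internally: it assumes the `scanimage` tool from SANE and `gocr` are both installed, that the scanner backend accepts a `--resolution` option (this is device-dependent), and that the default scanner is the one you want. The file names and the helper names `scan_command`, `gocr_command` and `scan_to_text` are hypothetical.

```python
import shutil
import subprocess


def scan_command(resolution=300):
    """Build a scanimage call that writes a PNM image to standard output."""
    # --resolution is a backend option; some scanners may not accept it.
    return ["scanimage", "--format=pnm", "--resolution", str(resolution)]


def gocr_command(image_path, text_path):
    """Build a gocr call: -i names the input image, -o the output text file."""
    return ["gocr", "-i", str(image_path), "-o", str(text_path)]


def scan_to_text(image_path="page.pnm", text_path="page.txt", resolution=300):
    """Scan one page to a temporary image, then run gocr on that image."""
    for tool in ("scanimage", "gocr"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    # Step 1: scan to an image file (scanimage prints the image to stdout).
    with open(image_path, "wb") as image_file:
        subprocess.run(scan_command(resolution), stdout=image_file, check=True)
    # Step 2: hand the image to gocr, which writes the recognised text.
    subprocess.run(gocr_command(image_path, text_path), check=True)
    with open(text_path, encoding="utf-8", errors="replace") as text_file:
        return text_file.read()


if __name__ == "__main__":
    print(scan_to_text())
```

Keeping the intermediate image around makes it easy to rerun the OCR step with different settings, or with a different OCR program, without scanning again.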

  7. #7
    Linux User sgosnell's Avatar
    Join Date
    Oct 2010
    Location
    Baja Oklahoma
    Posts
    469
    The only way I get acceptable results with gocr is by scanning to the viewer as a .pdf or .tiff, then running the OCR program from the viewer. For whatever reason, saving directly as text gives me garbage. I haven't really played with it that much; this was just a very quick test for this thread.
