  1. #1
    Just Joined! clickalot's Avatar
    Join Date
    Nov 2009
    Location
    Erlangen, Germany
    Posts
    37

    filter special characters

    Hi,
    I have a text file, produced by pdfinfo, that sometimes contains "special" characters, and then an editor like gedit or medit complains: "Could not detect file character encoding."

    jEdit can open the file, so I'm not totally in the "cold"...

    If I use the cat -v command, I can see that one of the special characters causing the problem is ^@ (I think it's the ASCII NUL, or \0).

    But cat -v doesn't solve my problem of converting the file, because useful characters like Ü get translated into M-CM-^\

    I also tried:
    Code:
    $ iconv -c file.txt -o out.txt
    and:
    Code:
    iconv -c -f ISO8859-1 file.txt -t UTF-8 -o out.txt
    and:
    Code:
    dos2unix -bv file.txt
    but none of these worked either.

    How do I get rid of, or filter out, special characters?

  2. #2
    tpl
    Linux User
    Join Date
    Jan 2007
    Location
    cleveland
    Posts
    452
    I suggest you try "tr".

    Among other things, given a set of characters,
    it will delete them from its input:

    tr -d "tw" <file

    deletes every 't' and every 'w' from the file (a space inside
    the quotes would be deleted as well). Special characters can also
    be written as octal escapes, for example '\000' for the NUL byte.
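
To illustrate the octal form, here is a minimal sketch. The function name strip_nul is made up for this example; it only wraps tr -d with the octal escape for the NUL byte and filters standard input.

```shell
#!/bin/sh
# strip_nul (hypothetical helper name): delete every NUL byte (octal 000)
# from standard input; all other bytes pass through unchanged.
strip_nul() {
    tr -d '\000'
}

# Demo: printf emits "a", a NUL byte, "b" and a newline;
# after filtering only "ab" and the newline remain.
printf 'a\000b\n' | strip_nul
```

On a real file this would be used as: strip_nul < bad_file.txt > filtered.txt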
    the sun is new every day (heraclitus)

  3. #3
    Just Joined! clickalot's Avatar
    Join Date
    Nov 2009
    Location
    Erlangen, Germany
    Posts
    37


    Thanks, tpl!

    That did the trick in my case:
    Code:
    tr -d '\000' < bad_file.txt > filtered.txt
    I still wish Linux had a more general tool capable of handling this problem...
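
A somewhat more general filter can be sketched with tr alone, assuming the goal is to drop ASCII control characters while keeping tab, newline, carriage return and every byte above 0x7F. The cat -v output M-CM-^\ for Ü corresponds to the bytes 0xC3 0x9C, which is how UTF-8 encodes Ü, so the file is probably UTF-8 already and those bytes should be left alone. The function name strip_controls is made up for this example.

```shell
#!/bin/sh
# strip_controls (hypothetical helper name): delete ASCII control bytes
# from standard input, but keep tab (011), newline (012) and carriage
# return (015). Bytes 0x80 and above are untouched, so multi-byte UTF-8
# sequences such as the two bytes of "Ü" survive.
strip_controls() {
    # LC_ALL=C makes tr treat its input strictly as single bytes.
    LC_ALL=C tr -d '\000-\010\013\014\016-\037\177'
}

# Demo: "A", NUL, SOH, the UTF-8 bytes of "Ü" (octal 303 234), tab, "B".
# The NUL and SOH bytes are removed; everything else is kept.
printf 'A\000\001\303\234\tB\n' | strip_controls
```

Usage on a file would look like: strip_controls < bad_file.txt > filtered.txt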
