I have been trying for 2 days to solve this problem, so I am ready to ask for help.
Background: I am trying to move my CD collection into iTunes, and ultimately to my
iPod. When the titles are looked up in the Gracenotes database, some titles,
particularly classical titles, come back with embedded non-printing characters. I am
seeing octal 357, 200, and 242 using od. I would like to strip these out, as they
interfere with some copy tools I use.
I don't think this problem can be solved in Windows, and OS X refuses to touch my
NTFS file system except for reading. So I am using SUSE 11, which has no such qualms.
I can iterate over the file system, I can write the rename portion & submit it to a
shell. I need to be able to grep for an octal pattern. I understand grep accepts a
regex, but I can't seem to get the syntax right. It would be something like
" ... | grep '\357' | ... ", etc. This doesn't work, BTW.
If you have ever done this, or just know how to do it, I would appreciate a heads up. As a side exercise, I would like to print the character corresponding to an
octal pattern on the screen. Something like printf "%c" \357, but one that works.
Any help is much appreciated. Feel free to email me direct.
Location: I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
Posts: 2,528
For octal values, you need to precede the actual data value with a zero (0), so for octal 354 you would use \0354. Try that - no guarantees, but it works in C/C++. Also, if this is on the command line, the backslash is an escape character, so you may need to double it up, as in "\\0354".
__________________
Sometimes, real fast is almost as good as real time.
An alternative, for the record, might be to simply grep for anything that is not a valid character. So you might try:
Code:
egrep '[^[:alnum:][:punct:][:space:]]'
egrep is the same as grep -E, which enables POSIX Extended Regular Expressions. The bracket expression is a negated character class: it matches any character that is not alphanumeric, punctuation, or whitespace.
This way, you don't need to worry about matching something specific, but instead can worry about only preserving certain characters.
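If you do want to match the exact bytes from the original question, here is a minimal sketch. POSIX basic and extended regular expressions define no octal escape, so the bytes have to be expanded before grep sees them; printf can do that. Octal 357 200 242 read together is the UTF-8 encoding of U+F022, a private-use code point. The sample title and the variable names are made up for illustration, and LC_ALL=C is an assumption that makes grep and sed work on raw bytes instead of decoded characters.

```shell
# Build the three-byte sequence (octal 357 200 242) with POSIX printf.
pattern=$(printf '\357\200\242')

# Hypothetical title that contains the unwanted bytes.
name=$(printf 'Concerto\357\200\242 in D')

# LC_ALL=C: treat input as plain bytes. -F: match a fixed string, not a regex.
if printf '%s\n' "$name" | LC_ALL=C grep -q -F "$pattern"; then
    # Strip every occurrence of the sequence and keep the cleaned name.
    clean=$(printf '%s\n' "$name" | LC_ALL=C sed "s/$pattern//g")
    printf '%s\n' "$clean"    # prints: Concerto in D
fi
```

Removing the three bytes as one unit, rather than deleting each byte value separately, should leave other multi-byte characters that happen to contain 200 or 242 intact.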
As for your other question, to print in Bash a character knowing only its octal value, you use this special notation:
Code:
echo $'\nnn'
So for instance:
Code:
alex@danu ~ $ echo $'\101'
A
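The $'\nnn' form is a Bash feature. As a hedged alternative for other shells, POSIX printf understands the same backslash-octal escapes in its format string, which is close to what the original question was reaching for. The sketch below prints one character from its octal value and then feeds the three bytes from the question back through od to confirm them; the variable names are only for illustration.

```shell
# Print the character with octal value 101.
letter=$(printf '\101')
printf '%s\n' "$letter"    # prints: A

# Emit the three bytes from the question and dump them back as octal.
# -An drops the offset column, -to1 prints one octal value per byte.
octal=$(printf '\357\200\242' | od -An -to1)
echo $octal                # prints: 357 200 242
```

How the three-byte sequence itself is drawn depends on the terminal and font, since it is a private-use character.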
I hope that helps!
__________________
DISTRO=Gentoo
Registered Linux User #388732
Gentoo Linux, 410 GB HD, 1.2 GB RAM, Fluxbox, These are a Few of my Favorite Things