- 06-16-2009 #1 Just Joined!
- Join Date
- Oct 2006
- Posts
- 3
Grepping for octal numbers
Gentlemen,
I have been trying for 2 days to solve this problem, so I am ready to ask for help.
Background: I am trying to move my CD collection into iTunes, and ultimately to my
iPod. When the titles are looked up in the Gracenote database, some titles,
particularly classical titles, come back with embedded non-printing characters. I am
seeing octal 357, 200, and 242 using od. I would like to strip these out, as they
interfere with some copy tools I use.
I don't think this problem can be solved in Windows, and OS X refuses to touch my
NTFS file system except for reading. So I am using SUSE 11, which has no such qualms.
I can iterate over the file system, and I can write the rename portion and submit it to a
shell. What I need is to be able to grep for an octal pattern. I understand grep accepts a
regex, but I can't seem to get the syntax right. It would be something like
" ... | grep '\357' | ... " etc. This doesn't work, BTW.
If you have ever done this, or just know how to do it, I would appreciate a heads up.
As a side exercise, I would like to print the character corresponding to an
octal pattern on the screen. Something like printf "%c" \357, but one that works.
Any help is much appreciated. Feel free to email me direct.
Thanks,
Greg
- 06-17-2009 #2 Linux Guru
- Join Date
- Apr 2009
- Location
- I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
- Posts
- 8,961
For octal values, you need to precede the actual data value with a zero (0), so for octal 354 you would use \0354. Try that - no guarantees, but it works in C/C++. Also, if this is on the command line, the backslash is an escape character, so you may need to double it up, as in "\\0354".
Sometimes, real fast is almost as good as real time.
Just remember, Semper Gumbi - always be flexible!
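For the record, here is a minimal sketch of one way to grep for specific bytes given their octal values. It assumes a POSIX shell and a grep that compares raw bytes under LC_ALL=C; the sample titles and the has_bytes helper are made up for illustration. Note that in C an octal escape takes at most three digits, so "\0354" would be read as \035 followed by the character 4, which is why the three-digit form is used here.

```shell
# Build the three-byte pattern from its octal values.
# printf interprets \ooo (one to three octal digits) in its format string.
pattern=$(printf '\357\200\242')

# Hypothetical sample titles: one clean, one containing bytes 357 200 242.
clean='Symphony No. 5'
dirty=$(printf 'Symphony\357\200\242 No. 5')

# LC_ALL=C asks grep to compare raw bytes; -F treats the pattern as a
# fixed string, so no regex syntax is involved at all.
has_bytes() {
    printf '%s\n' "$1" | LC_ALL=C grep -q -F -e "$pattern"
}

if has_bytes "$dirty"; then echo "dirty: match"; fi
if ! has_bytes "$clean"; then echo "clean: no match"; fi
```

The idea is to let printf (or the shell) turn the octal values into real bytes before grep ever sees them, instead of hoping grep understands an octal escape inside the regex.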
- 06-17-2009 #3
An alternative, for the record, might be to simply grep for anything that is not a valid character. So you might try:
Code:
egrep '[^[:alnum:][:punct:][:space:]]'
egrep is the same as grep -E, which enables POSIX Extended Regular Expressions. I then create a character class that matches anything that is not alphanumeric, punctuation, or a space.
This way, you don't need to worry about matching something specific, but instead can worry about only preserving certain characters.
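The same "only preserve certain characters" idea can be turned around to do the stripping itself. A rough sketch, assuming a POSIX tr; the sanitize helper and the sample file name are hypothetical. Be aware that under LC_ALL=C every byte above 127 falls outside these classes, so legitimate accented letters in a title would be removed too.

```shell
# Hypothetical helper: print a name with every byte removed that is not
# alphanumeric, punctuation, or whitespace.
# -c complements the set, -d deletes whatever is in the complement.
sanitize() {
    printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:][:punct:][:space:]'
}

# Made-up file name containing the bytes 357 200 242 from the question.
name=$(printf 'Sonata\357\200\242 No. 2.mp3')

sanitize "$name"; echo
```

If accented titles need to survive, it is probably safer to remove only the exact three-byte sequence rather than everything outside the classes.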
As for your other question, to print in Bash a character knowing only its octal value, you use this special notation:
Code:
echo $'\nnn'
So for instance:
Code:
alex@danu ~ $ echo $'\101'
A
I hope that helps!
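If you are not in Bash, printf should do the same job, since it interprets octal escapes in its format string. A small sketch; the octal_char helper is made up for illustration. One thing worth knowing: the bytes 357 200 242 together are the UTF-8 encoding of a single code point, U+F022, which sits in the private use area. That would explain why printing \357 on its own shows nothing useful, and what the full sequence looks like will depend on your terminal and font.

```shell
# printf interprets \ooo (one to three octal digits) in its format string.
printf '\101\n'          # octal 101 is ASCII "A"

# Hypothetical helper: print the byte for an octal value given as $1.
# "\\$1" expands to a backslash followed by the digits, e.g. \101.
octal_char() {
    printf "\\$1"
}

octal_char 101; echo

# Dump the three bytes from the original question back out in octal.
printf '\357\200\242' | od -An -b
```

Piping through od is a quick way to confirm that the bytes you produced are the ones you meant to produce.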
DISTRO=Arch
Registered Linux User #388732


