- 12-08-2010 #1 Just Joined!
- Join Date
- Feb 2008
- Posts
- 15
Bash script - text manipulation
Hi,
I use grep every day, but today I need to do something I am not sure how to do.
I have grepped a log file to obtain every line that contains a word (let's call it 'blah').
I now want to only display a list of entries within that search result that feature the word 'host' anywhere in the line, and I also want to display the single word *after* 'host' (up until the next space).
So, the end result will look like this:
host sfsfdfdf.sdf
host erereef.csdfdf.erer
host dwrerer.serre
host ffgfgg.sdsd
So, I will be doing
grep blah myfile |
I just don't know what to put after the pipe.
Thank you!
- 12-09-2010 #2
Show us a sample of the input data. You can (for starters):
Code:
grep blah some_file | grep host
and get it narrowed down to available lines for further processing, but again, we need to see what the input data looks like.
- 12-09-2010 #3 Just Joined!
- Join Date
- Feb 2008
- Posts
- 15
Hi Barriehie,
here is a sample of the data. Every line will contain the word 'host', so I don't need to grep for that. I've put two lines below (they are quite long).
Basically, I just want to extract the following from them:
host aaabbb.aaaachina.com
host mail.eerrr.net
2010-12-08 23:44:39 1PQTem-0007Cs-Ll SMTP error from remote mail server after MAIL FROM:<aaaa@bbbbb.com.au> SIZE=9105467: host aaabbb.aaaachina.com [123.123.123.123]: 451 MI:RBL dul.dnsbl.sorbs.net pppp.cn/service/faq/youx/mailsy/200905/3781.html [123.123.123.123]
2010-12-09 14:05:29 1PQboj-00067o-N8 == qqqwww@rrrttt.com R=dnslookup T=remote_smtp defer (-45): SMTP error from remote mail server after MAIL FROM:<qqqwww@rrrttt.com>: host mail.eerrr.net [119.235.18.63]: 476 ERR_TMP_SENDER_REJECT_BLACKLIST(476) - Sorry, your message is temporarily rejected because sender is currently on one of the blacklists - blacklist=RBL/dul.dnsbl.sorbs.net, ip=123.123.123.123
- 12-09-2010 #4
What do you want to accomplish? Do you just want to show the plain matches without any context? Then use parameter -o (or --only-matching).
To also display the contents after "host" up to the next space, use regular expressions:
Code:
echo -e "host example.com foo\nbar host example.net" | grep -oP 'host \S+'
Read more about grep and regular expressions here: Refining Linux: #8: Advanced usage of grep and of course the man page for grep.
Refining Linux Advent calendar: “24 Outstanding ZSH Gems”
- 12-09-2010 #5
My grep isn't compiled with the P switch as Manko10 indicates so here's plan B.
Code:
$ > sed -e 's/^.* host/host/' ./myfile.dat | gawk '{ if($0~/^host.*$/) {print $1, $2} }'
host aaabbb.aaaachina.com
host mail.eerrr.net
$ >
I used the two lines you've provided for myfile.dat.
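For comparison, the same idea can probably be done in awk alone, without the sed step. This is only a sketch: `host_pairs` is a made-up helper name, and the two sample lines are shortened stand-ins for the real log entries. It looks for a field that is exactly `host` and prints it with the field that follows.

```shell
# Hypothetical helper: print "host <next-word>" for each line that has
# the standalone word "host". Reads the named files, or stdin if none.
host_pairs() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "host") {
                print $i, $(i + 1)
                break   # stop at the first "host" on each line
            }
        }
    }' "$@"
}

# Two shortened lines modelled on the sample log entries (assumed data)
sample='2010-12-08 23:44:39 SMTP error: host aaabbb.aaaachina.com [123.123.123.123]: 451
2010-12-09 14:05:29 defer (-45): host mail.eerrr.net [119.235.18.63]: 476'

result=$(printf '%s\n' "$sample" | host_pairs)
printf '%s\n' "$result"
```

Unlike the sed version, this only matches `host` as a whole whitespace-separated field, and it takes the first occurrence on a line rather than the last, so the two could differ on lines where the word appears more than once.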
- 12-09-2010 #6
If your grep is not compiled with PCRE support, you can also use normal extended regular expression:
Code:
echo -e 'host example.com foo\nbar host example.net' | grep -oE 'host [^[:space:]]+'
or basic regular expression:
Code:
echo -e 'host example.com foo\nbar host example.net' | grep -o 'host [^[:space:]]\+'
PCREs are just more convenient.
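If you don't know in advance whether a machine's grep supports `-P`, one option is to test for it and fall back to the extended expression. The following is a rough sketch, not a tested recipe: `extract_hosts` is a hypothetical wrapper name and the log lines are invented. It also shows the extraction sitting after the original `grep blah` filter.

```shell
# Hypothetical wrapper: use the PCRE pattern when this grep accepts -P,
# otherwise fall back to the equivalent extended regular expression.
extract_hosts() {
    if printf 'host x\n' | grep -qP 'host \S+' 2>/dev/null; then
        grep -oP 'host \S+' "$@"
    else
        grep -oE 'host [^[:space:]]+' "$@"
    fi
}

# The original pipeline: filter on 'blah' first, then extract.
# These log lines are made up for illustration.
result=$(printf '%s\n' \
    'blah: host example.com [10.0.0.1]: 451' \
    'other: host skipped.example.org [10.0.0.2]' \
    'blah: host mail.example.net [10.0.0.3]: 476' \
    | grep blah | extract_hosts)
printf '%s\n' "$result"
```

Both branches should produce the same lines here, since `\S+` and `[^[:space:]]+` both match a run of non-whitespace characters.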
- 12-09-2010 #7
So for the case given:
Code:
$ > grep -oE 'host [^[:space:]]+' ./myfile.dat
host aaabbb.aaaachina.com
host mail.eerrr.net
$ >
Pretty cool. I guess I'll have to add more studies of grep in between perl, bash, sed, ...
- 12-12-2010 #8 Just Joined!
- Join Date
- Feb 2008
- Posts
- 15
Thanks guys for your help. I'd never even heard of PCRE regular expressions, but they seem to be a really clean way of going about things.
- 12-12-2010 #9
"PCRE regular expressions"
Uh… that's a tautology.

PCRE just stands for Perl Compatible Regular Expressions, i.e., regular expressions as they are used in Perl (and, in very similar form, in PHP, Apache, Python etc.).

