  1. #1
    Just Joined!
    Join Date
    Feb 2008
    Posts
    15

    Bash script - text manipulation

    Hi,
    I use grep every day, but today I need to do something I am not sure how to do.

    I have grepped a log file to obtain every line that contains a word (let's call it 'blah').
    I now want to only display a list of entries within that search result that feature the word 'host' anywhere in the line, and I also want to display the single word *after* 'host' (up until the next space).

    So, the end result will look like this:
    host sfsfdfdf.sdf
    host erereef.csdfdf.erer
    host dwrerer.serre
    host ffgfgg.sdsd

    So, I will be doing
    grep blah myfile |

    I just don't know what to put after the pipe.
    Thank you!

  2. #2
    Just Joined! barriehie's Avatar
    Join Date
    Apr 2008
    Posts
    81
    Show us a sample of the input data. For starters, you can run:
    Code:
    grep blah some_file | grep host
    That narrows the output down to the lines worth processing further, but again, we need to see what the input data looks like.

  3. #3
    Just Joined!
    Join Date
    Feb 2008
    Posts
    15
    Hi Barriehie,
    here is a sample of the data. Every line will contain the word 'host', so I don't need to grep for that. I've put two lines below (they are quite long).

    Basically, I just want to extract the following from them:
    host aaabbb.aaaachina.com
    host mail.eerrr.net


    2010-12-08 23:44:39 1PQTem-0007Cs-Ll SMTP error from remote mail server after MAIL FROM:<aaaa@bbbbb.com.au> SIZE=9105467: host aaabbb.aaaachina.com [123.123.123.123]: 451 MI:RBL dul.dnsbl.sorbs.net pppp.cn/service/faq/youx/mailsy/200905/3781.html [123.123.123.123]

    2010-12-09 14:05:29 1PQboj-00067o-N8 == qqqwww@rrrttt.com R=dnslookup T=remote_smtp defer (-45): SMTP error from remote mail server after MAIL FROM:<qqqwww@rrrttt.com>: host mail.eerrr.net [119.235.18.63]: 476 ERR_TMP_SENDER_REJECT_BLACKLIST(476) - Sorry, your message is temporarily rejected because sender is currently on one of the blacklists - blacklist=RBL/dul.dnsbl.sorbs.net, ip=123.123.123.123

  4. #4
    Linux User Manko10's Avatar
    Join Date
    Sep 2010
    Posts
    250
    What do you want to accomplish? Do you just want to show the plain matches without any context? Then use the -o option (or --only-matching).
    To also display the text after "host" up to the next space, use a regular expression:
    Code:
    echo -e "host example.com foo\nbar host example.net" | grep -oP 'host \S+'
    Read more about grep and regular expressions here: Refining Linux: #8: Advanced usage of grep, and of course in the man page for grep.
    Refining Linux Advent calendar: “24 Outstanding ZSH Gems”

  5. #5
    Just Joined! barriehie's Avatar
    Join Date
    Apr 2008
    Posts
    81
    My grep isn't compiled with support for the -P switch that Manko10 uses, so here's plan B.
    Code:
    $ sed -e 's/^.* host/host/' ./myfile.dat | gawk '{ if($0~/^host.*$/) {print $1, $2} }'
    host aaabbb.aaaachina.com
    host mail.eerrr.net
    I used the two lines you've provided for myfile.dat.
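
    For what it's worth, the same result can probably be had from awk alone by walking the fields and printing "host" together with whatever field follows it. This is only a sketch: the function name extract_host is made up for illustration, and the two input lines are shortened stand-ins for the log lines posted above, not the real file.

```shell
# Hypothetical helper: print "host" plus the field that follows it.
extract_host() {
    awk '{
        # Walk the whitespace-separated fields; stop before the last one
        # so that a following field always exists.
        for (i = 1; i < NF; i++) {
            if ($i == "host") {
                print $i, $(i + 1)
                break   # only the first "host" on each line
            }
        }
    }' "$@"
}

# Shortened stand-ins for the two log lines posted above.
printf '%s\n' \
    'SIZE=9105467: host aaabbb.aaaachina.com [123.123.123.123]: 451 MI:RBL' \
    'MAIL FROM:<qqqwww@rrrttt.com>: host mail.eerrr.net [119.235.18.63]: 476' |
    extract_host
```

    Because it compares whole fields, this only fires when "host" stands alone as a word, and lines without it print nothing.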

  6. #6
    Linux User Manko10's Avatar
    Join Date
    Sep 2010
    Posts
    250
    If your grep is not compiled with PCRE support, you can also use a normal extended regular expression:
    Code:
    echo -e 'host example.com foo\nbar host example.net' | grep -oE 'host [^[:space:]]+'
    or a basic regular expression:
    Code:
    echo -e 'host example.com foo\nbar host example.net' | grep -o 'host [^[:space:]]\+'
    PCREs are just more convenient.

  7. #7
    Just Joined! barriehie's Avatar
    Join Date
    Apr 2008
    Posts
    81
    So for the case given:
    Code:
    $ grep -oE 'host [^[:space:]]+' ./myfile.dat
    host aaabbb.aaaachina.com
    host mail.eerrr.net
    Pretty cool. I guess I'll have to add more studies of grep in between perl, bash, sed, ...
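
    To tie this back to the original question (what goes after the pipe in "grep blah myfile |"), here is a rough end-to-end sketch. The log content is made up: the lines are shortened, the "blah" keyword is the placeholder from the first post, and the middle line is invented so that the first grep has something to filter out.

```shell
# Made-up stand-in for the real log; "blah" is the placeholder keyword
# from the original question.
logfile="$(mktemp)"
printf '%s\n' \
    'blah SIZE=9105467: host aaabbb.aaaachina.com [123.123.123.123]: 451' \
    'other MAIL FROM:<x@example.com>: host skipped.example.org [192.0.2.1]' \
    'blah MAIL FROM:<qqqwww@rrrttt.com>: host mail.eerrr.net [119.235.18.63]' \
    > "$logfile"

# First grep keeps lines containing the keyword; second prints only
# "host" plus the run of non-space characters after it.
result="$(grep blah "$logfile" | grep -oE 'host [^[:space:]]+')"
printf '%s\n' "$result"

rm -f "$logfile"
```

    One caveat: the pattern also matches when "host" is just the tail of a longer word (for example "localhost x"), so it may need tightening if the log contains such words.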

  8. #8
    Just Joined!
    Join Date
    Feb 2008
    Posts
    15
    Thanks guys for your help. I'd never even heard of PCRE regular expressions, but they seem to be a really clean way of going about things.

  9. #9
    Linux User Manko10's Avatar
    Join Date
    Sep 2010
    Posts
    250
    "PCRE regular expressions"
    Uh… that's a tautology.
    PCRE just stands for Perl Compatible Regular Expressions, i.e., regular expressions as they are used in Perl (and PHP, Apache, Python, etc.).
