  1. #1
    Just Joined!
    Join Date
    Aug 2012
    Posts
    2

    grep -f is very slow


    Hi all,
    I'm new here and I hope someone could help me find a better solution to my problem.
    I need to parse the log files (compressed) looking for certain patterns. I use the following one-liner:

    Code:
    zgrep -h -f patterns.txt logs*.gz
    The problem is that it's very slow. The compressed logs total 180 GB, and at the rate I'm seeing, the search will take 3 weeks to complete.

    There are about 30 patterns in the patterns.txt file. A zgrep for a single pattern (without the -f option) completes in about 2 hours, but with the -f option the run takes far more than 30 times as long.

    Is there a better solution to this problem?

    zgrep did not allow me to use -F (the fixed-strings option). Maybe it would run faster if I used grep -F?

    Running on RHEL5.6

  2. #2
    Linux Newbie hagfish52
    Join Date
    Dec 2011
    Location
    Asheville, NC
    Posts
    225
    You may get faster results by limiting your search with "head" or "tail", or by filtering by date. This webpage goes into pretty exhaustive detail about searching large log files: 10 Awesome Examples for Viewing Huge Log Files in Unix
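One way to read the "head" suggestion is to try the pattern file on only the first lines of a single compressed log before committing to the full run. The following is a rough sketch on made-up data: the file names, log lines, and the 3-line limit are assumptions for illustration, not taken from the original logs.

```shell
# Build a tiny stand-in for one compressed log (hypothetical contents).
workdir=$(mktemp -d)
printf 'line1 ERROR a\nline2 INFO b\nline3 ERROR c\nline4 ERROR d\n' | gzip -c > "$workdir/sample.gz"
printf 'ERROR\n' > "$workdir/patterns.txt"

# Decompress to stdout, keep only the first 3 lines, then match the patterns
# as fixed strings (-F) read from the pattern file (-f).
sample=$(gzip -dc "$workdir/sample.gz" | head -n 3 | grep -F -f "$workdir/patterns.txt")
printf '%s\n' "$sample"

rm -rf "$workdir"
```

This only checks that the patterns behave as expected on a small slice; it does not make the full search over all logs any faster.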

  3. #3
    Just Joined!
    Join Date
    Aug 2012
    Posts
    2

    Use zfgrep

    I found the solution to this. Initially I got the following error when running zgrep -F (fixed strings):

    Code:
    $ zgrep -h -F -f patterns.txt logs*.gz >/tmp/result.txt
    egrep: conflicting matchers specified
    Apparently zgrep tries to use egrep by default, and egrep does not accept the -F option. But there is a zfgrep command too, which uses fgrep by default:

    Code:
    $ zfgrep -h -f patterns.txt logs*.gz >/tmp/result.txt
    With this, my log search finishes in under an hour instead of a few weeks.

    BTW: I could not reproduce that error message on my home computer (Ubuntu 12.04). Ubuntu ships a different zgrep version than RHEL, one that happily accepts the -F option.
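On a system where zgrep rejects -F and zfgrep is not available, the same fixed-string search can probably be written as an explicit pipeline: decompress with gzip -dc and feed the stream to plain grep -F. The sketch below uses small throwaway files in a temporary directory in place of the real logs; the file contents are made up for illustration. Setting LC_ALL=C is an extra assumption that often helps grep on plain ASCII logs and was not part of the original commands.

```shell
# Build a tiny stand-in for the real data (hypothetical contents).
workdir=$(mktemp -d)
printf 'ERROR disk full\nINFO all good\n' | gzip -c > "$workdir/logs1.gz"
printf 'WARN timeout (5s)\nINFO still good\n' | gzip -c > "$workdir/logs2.gz"
printf 'disk full\ntimeout (5s)\n' > "$workdir/patterns.txt"

# Decompress all logs to one stream, then match fixed strings (-F) taken
# from the pattern file (-f). grep reads a single stream here, so no file
# names are printed, which mirrors zgrep -h.
result=$(gzip -dc "$workdir"/logs*.gz | LC_ALL=C grep -F -f "$workdir/patterns.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Because -F treats every pattern as a literal string, characters such as the parentheses in "timeout (5s)" need no escaping.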
