  1. #1
    Just Joined!
    Join Date
    Feb 2013
    Posts
    2

    Faster search inside files


    When I use the command below, it takes 2 seconds to find the right file (I have 30 files under this directory).
    I want a faster search, using whatever command you recommend. I have used "find" and "grep" so far, and they are not fast enough, because a PHP function uses grep to identify the word on the results page. Is there any blazing-fast system I could use? I can install any command or software on my server as long as it works fast. I also tried a ramdisk, but it made no difference. Any suggestions from Linux gurus are highly appreciated.

    (It needs to be an exact match only. -Rc works in 1 second but does not give an exact match; "--word-regexp" catches the exact match but takes 2-3 seconds to return a result. I have been trying to solve this problem for 3 days.)


    grep -Rc --word-regexp 'departmento'

    /language/latest/swe.txt:0
    /language/latest/ltz.txt:0
    /language/latest/liv.txt:0
    /language/latest/bos.txt:0
    /language/latest/spa.txt:1
    /language/latest/azj.txt:0
    /language/latest/afr.txt:0
    /language/latest/als.txt:0
    /language/latest/cym.txt:0
    /language/latest/oci.txt:0



    grep -Rc --word-regexp 'department'

    /language/latest/swe.txt:0
    /language/latest/ltz.txt:0
    /language/latest/liv.txt:0
    /language/latest/bos.txt:0
    /language/latest/spa.txt:0
    /language/latest/azj.txt:0
    /language/latest/afr.txt:0
    /language/latest/als.txt:0
    /language/latest/cym.txt:0
    /language/latest/oci.txt:0

  2. #2
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,258
    Hi.

    Welcome to the forum.

    Of the tools I have tested, the grep family is the fastest of the easily usable tools for single-shot searching. See report below.

    If you are doing this many times on static data, then an index might make sense. One such tool is sary, a suffix array library and tools. It was in the Debian repositories, and it can also be downloaded from SourceForge. The downside is that index creation is a separate step that needs to be rerun after any change is made to the data.

    In turn, sary refers to Namazu, a full-text search engine for large quantities of files. However, I have not yet used that.
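
As a rough illustration of the indexing idea (not of how sary or Namazu work internally), the sketch below builds a plain tab-separated word index once and then answers exact-word lookups from it. The helper names `build_index` and `lookup_word` are hypothetical, the `*.txt` file pattern is an assumption taken from the file names in the first post, and words are split on whitespace only, so punctuation stays attached to a word.

```shell
#!/bin/sh
# Hypothetical helpers: a plain-text word index, not sary's suffix array.

# build_index DIR INDEX_FILE
# Writes one "word<TAB>file" line per distinct word per file, sorted.
build_index() {
    # Emit every whitespace-separated token together with its file name,
    # then drop duplicates so each (word, file) pair appears once.
    awk '{ for (i = 1; i <= NF; i++) print $i "\t" FILENAME }' "$1"/*.txt |
        LC_ALL=C sort -u > "$2"
}

# lookup_word WORD INDEX_FILE
# Prints every file whose index entry equals WORD exactly.
lookup_word() {
    # The word is passed through the environment so that awk does not
    # interpret backslashes in it; appending "" forces a string comparison.
    WORD=$1 awk -F '\t' '($1 "") == (ENVIRON["WORD"] "") { print $2 }' "$2"
}
```

The index would be rebuilt with something like `build_index /language/latest /tmp/words.idx` whenever the files change, and each search then becomes `lookup_word direzione /tmp/words.idx`. The lookup still scans the index linearly; a real indexing tool does better than that.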

    Best wishes ... cheers, drl
    Code:
    Comparison of code & languages for searching text files
    Sun Feb 24 06:04:55 CST 2013
    
    Data file is 100 copies of the novel Moby Dick.  The string
    to be searched for is "Fleece" which occurs 11 times in the
    novel, hence the output is checked for 1100 lines.
    
    Notable: "wc" does no searching, just counting, run as a kind of
    baseline. It is run first so that any cache delays will be borne by it.
    
    "grepp" is "grep -P Fleece", using the perl expression option.
    
    
     File /tmp/100-mb.txt: 1,777,700 lines, 120,540,400 bytes
    
    Alphabetically:
            code   cpu   real system real/cpu cpu/best real/best sys/best
             ack  7.71   8.60   0.15     1.12    59.31     37.39     5.00
           agrep  0.19   0.29   0.08     1.53     1.46      1.26     2.67
             awk  1.34   1.48   0.07     1.10    10.31      6.43     2.33
        bash (1) 90.30 100.10   4.50     1.11   694.62    435.22   150.00
           cgrep  0.16   0.31   0.12     1.94     1.23      1.35     4.00
            csed  2.60   2.73   0.08     1.05    20.00     11.87     2.67
       fgrep (2)  0.18   0.25   0.04     1.39     1.38      1.09     1.33
    gfortran (2)  5.38   7.88   0.07     1.46    41.38     34.26     2.33
           glark 18.92  22.34   2.20     1.18   145.54     97.13    73.33
            grep  0.18   0.24   0.05     1.33     1.38      1.04     1.67
           grepp  0.52   0.66   0.09     1.27     4.00      2.87     3.00
        Icon (1) 24.56  25.64   0.46     1.04   188.95    111.48    15.38
            java  2.16   2.56   0.15     1.19    16.62     11.13     5.00
         ksh (1) 17.57  18.87   0.06     1.07   135.12     82.03     1.89
             Lua  3.25   3.49   0.12     1.07    25.00     15.17     4.00
            mawk  0.51   0.64   0.10     1.25     3.92      2.78     3.33
            perl  1.41   1.56   0.09     1.11    10.85      6.78     3.00
             PHP  6.24  12.53   0.09     2.01    48.00     54.48     3.00
          Python  6.40   9.28   0.14     1.45    49.23     40.35     4.67
        rexx (2)  6.50   8.27   0.08     1.27    50.00     35.96     2.67
            ruby  2.00   2.19   0.13     1.09    15.38      9.52     4.33
             sed  1.43   1.57   0.09     1.10    11.00      6.83     3.00
           tclsh 14.14  14.66   0.08     1.04   108.77     63.74     2.67
              wc  0.13   0.23   0.08     1.77     1.00      1.00     2.67
         zsh (1) 66.40 200.00 129.40     3.01   510.77    869.57  4313.33
    
    By cpu:
            code   cpu   real system real/cpu cpu/best real/best sys/best
              wc  0.13   0.23   0.08     1.77     1.00      1.00     2.67
           cgrep  0.16   0.31   0.12     1.94     1.23      1.35     4.00
       fgrep (2)  0.18   0.25   0.04     1.39     1.38      1.09     1.33
            grep  0.18   0.24   0.05     1.33     1.38      1.04     1.67
           agrep  0.19   0.29   0.08     1.53     1.46      1.26     2.67
            mawk  0.51   0.64   0.10     1.25     3.92      2.78     3.33
           grepp  0.52   0.66   0.09     1.27     4.00      2.87     3.00
             awk  1.34   1.48   0.07     1.10    10.31      6.43     2.33
            perl  1.41   1.56   0.09     1.11    10.85      6.78     3.00
             sed  1.43   1.57   0.09     1.10    11.00      6.83     3.00
            ruby  2.00   2.19   0.13     1.09    15.38      9.52     4.33
            java  2.16   2.56   0.15     1.19    16.62     11.13     5.00
            csed  2.60   2.73   0.08     1.05    20.00     11.87     2.67
             Lua  3.25   3.49   0.12     1.07    25.00     15.17     4.00
    gfortran (2)  5.38   7.88   0.07     1.46    41.38     34.26     2.33
             PHP  6.24  12.53   0.09     2.01    48.00     54.48     3.00
          Python  6.40   9.28   0.14     1.45    49.23     40.35     4.67
        rexx (2)  6.50   8.27   0.08     1.27    50.00     35.96     2.67
             ack  7.71   8.60   0.15     1.12    59.31     37.39     5.00
           tclsh 14.14  14.66   0.08     1.04   108.77     63.74     2.67
         ksh (1) 17.57  18.87   0.06     1.07   135.12     82.03     1.89
           glark 18.92  22.34   2.20     1.18   145.54     97.13    73.33
        Icon (1) 24.56  25.64   0.46     1.04   188.95    111.48    15.38
         zsh (1) 66.40 200.00 129.40     3.01   510.77    869.57  4313.33
        bash (1) 90.30 100.10   4.50     1.11   694.62    435.22   150.00
    
    
    Top 5 by real, then real/cpu:
            code   cpu   real system real/cpu cpu/best real/best sys/best
         zsh (1) 66.40 200.00 129.40     3.01   510.77    869.57  4313.33
        bash (1) 90.30 100.10   4.50     1.11   694.62    435.22   150.00
        Icon (1) 24.56  25.64   0.46     1.04   188.95    111.48    15.38
           glark 18.92  22.34   2.20     1.18   145.54     97.13    73.33
         ksh (1) 17.57  18.87   0.06     1.07   135.12     82.03     1.89
    
    
    (1) code ran into a time limit; results are extrapolations.
    (2) code did not use regular expressions.
    
     Statistics for cpu:
    minimum   sum       cases   mean          median     range   std-dev   maximum
    0.13      280       25      11.2             2.6      90.2      21.5      90.3
    ( Mn, 2.6.n, AMD-64 3000+, ASUS A8V Deluxe, 1 GB, SATA + IDE, Matrox G400 AGP )

  3. #3
    Just Joined!
    Join Date
    Feb 2013
    Posts
    2
    Thank you very much for your response. I will investigate the link you gave.

    I found -P very fast, but it is still far from what I expected.

    [root@server1 latest]# time find ./ -name "*.ext" -print0 | grep -PRc --word-regexp 'direzione' /language/latest
    /language/latest/swe.txt:0
    /language/latest/ltz.txt:0
    /language/latest/lit.txt:0
    /language/latest/bos.txt:0
    /language/latest/lav.txt:0
    /language/latest/som.txt:0
    /language/latest/deu.txt:0
    /language/latest/spa.txt:0
    /language/latest/azj.txt:0
    /language/latest/ita.txt:1

    real 0m0.225s
    user 0m0.180s
    sys 0m0.044s
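
Two changes that often help a literal whole-word search like this one, offered as something to try rather than a guaranteed fix: `-F` treats the pattern as a fixed string instead of a regular expression, and `LC_ALL=C` makes grep compare bytes instead of multibyte characters, which was noticeably faster in older GNU grep releases. The helper name `count_word` is hypothetical. One caveat: in the C locale, `-w` treats only ASCII letters, digits and underscore as word characters, so a word that directly adjoins an accented letter may be counted differently than in a UTF-8 locale.

```shell
#!/bin/sh
# count_word WORD DIR  (hypothetical helper)
# Prints "file:count" for every file under DIR, counting lines that
# contain WORD as a whole word, matched as a literal string.
count_word() {
    # -r recurse, -c count matching lines, -w whole word, -F fixed string
    LC_ALL=C grep -rcwF -- "$1" "$2"
}
```

If only the names of the matching files are needed, using `-l` in place of `-c` lets grep stop reading each file at its first match.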



    Some people suggested running 8 parallel instances through a pipe using xargs, but that has not helped so far.


    [root@server1 latest]# time find ./ -name "*.ext" -print0 | xargs -0 -n1 -P8 grep -PRc --word-regexp 'direzione' /language/latest
    /language/latest/swe.txt:0
    /language/latest/ltz.txt:0
    /language/latest/lit.txt:0
    /language/latest/bos.txt:0
    /language/latest/lav.txt:0
    /language/latest/som.txt:0
    /language/latest/deu.txt:0
    /language/latest/ita.txt:1
    real 0m0.227s
    user 0m0.186s
    sys 0m0.040s
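
In the xargs attempt above, grep is still given `/language/latest` together with `-R`, so any instance that runs searches the whole directory again, and the files listed in the output end in `.txt` rather than `.ext`. A sketch of how the work might be split across processes instead is shown below. `parallel_count` is a hypothetical helper name, and the `*.txt` pattern, the batch size of 4 and the 8 processes are assumptions. With only about 30 small files, the cost of starting the processes may well outweigh any gain.

```shell
#!/bin/sh
# parallel_count WORD DIR  (hypothetical helper)
# Hands the file names found by find to several grep processes at once.
parallel_count() {
    find "$2" -type f -name '*.txt' -print0 |
        # -0 reads NUL-separated names, -r skips the run if nothing was
        # found, -n 4 gives each grep four files, -P 8 runs up to eight
        # greps in parallel; -H keeps the file name in the output even
        # when a grep receives a single file.
        xargs -0 -r -n 4 -P 8 grep -cwH -- "$1"
}
```

The output lines arrive in whatever order the grep processes finish, so they may need to be sorted if the order matters.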
