  1. #1
    Linux Engineer Freston's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    1,047

    grep undefined # of $var

    I'm working on a database, and I'm running into a little problem. I've been searching the Internet for days without result. That is, I've felt I've come close many times, and I know what I want is possible, yet I just don't seem to cut it.

    I want to be able to feed an undefined number of variables into grep and/or sed. Of course looping is an obvious answer, but grep and sed interpret my search as an 'or' search. And I need an 'and' search.

    Code:
    Example database:
    1) testword1 testword2 text1 text2 001 002 003
    2) testword2 text3 003 004 005
    3) testword1 testword2 text3
    (...)
    ad infinitum
    When I loop grep over the input "testword1 text3", the search returns line 1, line 2 and line 3, whereas I want it to return only line 3.
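    To make the difference concrete, here is a small sketch using the sample lines above. The temporary file and the variable names are placeholders for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sample data copied from the example database above.
db=$(mktemp)
cat > "$db" <<'EOF'
1) testword1 testword2 text1 text2 001 002 003
2) testword2 text3 003 004 005
3) testword1 testword2 text3
EOF

# OR search: a line is printed if it matches either pattern.
or_result=$(grep -e testword1 -e text3 "$db")

# AND search: the second grep only sees lines that passed the first.
and_result=$(grep testword1 "$db" | grep text3)

printf 'OR:\n%s\n\nAND:\n%s\n' "$or_result" "$and_result"
rm -f "$db"
```

    The OR form prints all three lines; the piped form prints only line 3.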

    I now have it set up to work with a max of two variables, but that is not always enough. Sometimes I need more to discriminate enough between the lines.

    Code:
    # My current workaround
    if [ -z "$SEARCHWORD2" ] ; then
        grep "$SEARCHWORD1" database
    else
        grep "$SEARCHWORD1" database | grep "$SEARCHWORD2"
    fi
    Now I'm probably overlooking the obvious, but I can't seem to get the thing to work.




    So my questions:
    1) Can I catch all the input in a single variable? Or do I need to catch them separately?
    Code:
    Example:
    read SEARCHWORDS
    -or-
    read SRCHWRD1 SRCHWRD2 SRCHWRD3 SRCHWRD4
    -alternative-
    use 'cut' to separate $SEARCHWORDS into its components or use 'sed' to change spaces to delimiters.
    2) Which is better to use in this case, 'grep' or 'sed'?

    3) How do I get 'grep' and/or 'sed' to loop through these keywords and return only the lines in the database that contain all of these words, instead of every line that contains any one of them?

    Again, I'm overlooking something obvious, so I take a deep bow for your patience and reading.
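    One possible way to handle questions 1 and 3 together, as a minimal sketch: read all the words into a single variable, then narrow the result one word at a time. The function name and_search is made up for this example, and each word is treated as a basic regular expression (add -F to grep for literal strings).

```shell
#!/bin/sh
# and_search FILE "WORD1 WORD2 ...": print the lines of FILE containing every word.
# Hypothetical helper; the words arrive in one space-separated string,
# as they would after `read SEARCHWORDS`.
and_search() {
    file=$1
    words=$2
    result=$(cat "$file")
    set -f                      # no filename expansion while splitting $words
    for word in $words; do      # deliberately unquoted: split on whitespace
        # Keep only the lines that also contain this word.
        result=$(printf '%s\n' "$result" | grep -e "$word")
    done
    set +f
    if [ -n "$result" ]; then
        printf '%s\n' "$result"
    fi
}

# Demonstration with the sample data from the question.
db=$(mktemp)
cat > "$db" <<'EOF'
1) testword1 testword2 text1 text2 001 002 003
2) testword2 text3 003 004 005
3) testword1 testword2 text3
EOF

SEARCHWORDS="testword1 text3"   # stands in for: read SEARCHWORDS
and_search "$db" "$SEARCHWORDS"
rm -f "$db"
```

    Because every pass filters the output of the previous one, the number of words does not have to be known in advance.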

  2. #2
    drl
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts
    1,117
    Hi.

    When I need something like this, I reach for a script that was published in Advanced Perl Programming, from O'Reilly. Speaking of advanced: apologies in advance for the long post. The Perl script is:
    Code:
    #!/usr/bin/perl -w
    
    # @(#) mgrep    Perform multiple character regular expression matches (any order).
    # $Id: mgrep 189 2006-09-03 11:38:51Z drl $
    
    # drl / 97.09
    #
    # mgrep [ -p] pat1 pat2 pat3 ... patn -- file1 file2 ... filen
    # -p:   preserve case; if not specified case is ignored.
    #
    # Based on Advanced Perl Programming, p74ff.
    
    $program = `basename $0`; chomp($program);
    $usage = "usage: $program [ -p] pattern1 [ pattern2 ... patternn ] [ -- files ]\n";
    $debug = 1;
    $debug = 0;
    @patterns = ();
    $case = "i";    # case is ignored
    
    # Process arguments as patterns.
    
    unless ( @ARGV ) {
            die($usage);
    } else {
            while ( defined( $arg = shift(@ARGV) ) ) {      # defined(): a pattern of "0" must not end the loop
                    print "debug: argument :$arg:\n" if $debug;
                    if ( $arg eq '--' ) {
                            print "debug: found --\n" if $debug;
                            last;
                    } elsif ( $arg eq '-p' ) {
                            $case = "";     # Do NOT ignore case
                            next;
                    }
                    push(@patterns,$arg);
            }
    }
    
    unless ( @patterns ) {
            die($usage);
    }
    
    $code = 'while ( <> ) {';
    $code .= 'if (/';
    $code .= join('/' . $case . ' && /', @patterns);
    $code .= '/' . $case . ') {print$_}}';
    print "code = $code\n" if $debug;
    eval $code;     # compile and execute
    
    # Check for bad regular expressions.
    
    die("Error ---: $@\n Code: \n$code\n") if ( $@ );
    
    exit(0);
    and, a shell script that uses your sample data in file data1:
    Code:
    #!/bin/sh
    
    # @(#) s1       Demonstrate mgrep.
    
    echo " sh version: $BASH_VERSION"
    
    FILE=${1-data1}
    
    echo
    ./mgrep testword1 text3 -- "$FILE"
    
    exit 0
    produces:
    Code:
    % ./s1
     sh version: 2.05b.0(1)-release
    
    3) testword1 testword2 text3
    Keep in mind that the shell script isn't necessary; it's just easier for demonstration purposes. The shell script calls the Perl script, which builds a small Perl fragment from your pattern arguments and then evaluates it.
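    To see what that generated fragment looks like, the sketch below repeats the script's string concatenation in shell and prints the result. build_fragment is a hypothetical helper written only for this illustration; it is not part of mgrep.

```shell
#!/bin/sh
# build_fragment FLAG PATTERN...: print the Perl code mgrep would assemble.
# FLAG is "i" (ignore case, the default) or "" (after -p).
# Hypothetical helper, for illustration only.
build_fragment() {
    flag=$1
    shift
    code='while ( <> ) {if ('
    sep=''
    for pat in "$@"; do
        # Each pattern becomes one /regex/ test; tests are joined with &&.
        code="$code$sep/$pat/$flag"
        sep=' && '
    done
    printf '%s\n' "$code) {print\$_}}"
}

build_fragment i testword1 text3
```

    Each pattern becomes its own match test and the tests are joined with &&, so a line is printed only when every pattern matches, in any order. A pattern that itself contains a "/" would presumably break the generated code, which is where the script's check of $@ after the eval comes in.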

    It's not trivial, but it sure is useful. I hope it works as well for you as it has for me on those odd occasions when you really, really need it ... cheers, drl
    Welcome - get the most out of the forum by reading forum basics and guidelines: click here.
    90% of questions can be answered by using man pages, Quick Search, Advanced Search, Google search, Wikipedia.
    We look forward to helping you with the challenge of the other 10%.
    ( Mn, 2.6.n, AMD-64 3000+, ASUS A8V Deluxe, 1 GB, SATA + IDE, Matrox G400 AGP )

  3. #3
    Linux Engineer Freston's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    1,047
    Ok, that was not the answer I expected. But wow, ok, thanks. I found myself studying Perl. You know, I've wanted to learn Perl, but I was expecting to do so after I'd finished this script.

    It's actually not so difficult once you understand the syntax. I think this is something I can use. Cheers to you too! Thanks!

  4. #4
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    No Perl, just awk. At a minimum, the line below works, but not for an undefined number of search terms.
    Code:
    awk '/testword1/ && /text3/ { print }' "file"
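    A sketch of one way to lift the fixed-number limit while staying with awk: pass the terms in as a single string and let awk loop over them. The wrapper name awk_and is invented for this example, and the terms are assumed to contain no whitespace or backslashes.

```shell
#!/bin/sh
# awk_and FILE TERM...: print the lines of FILE that match every TERM.
# Hypothetical wrapper; each TERM is used as an awk regular expression.
awk_and() {
    file=$1
    shift
    awk -v terms="$*" '
        BEGIN { n = split(terms, t, " ") }
        {
            for (i = 1; i <= n; i++)
                if ($0 !~ t[i]) next    # one miss rejects the line
            print
        }
    ' "$file"
}

# Demonstration with the sample data from the first post.
db=$(mktemp)
cat > "$db" <<'EOF'
1) testword1 testword2 text1 text2 001 002 003
2) testword2 text3 003 004 005
3) testword1 testword2 text3
EOF

awk_and "$db" testword1 text3
rm -f "$db"
```

    The awk program itself never changes; only the terms string does, so nothing has to be generated or evaluated at run time.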

  5. #5
    drl
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts
    1,117
    Hi.

    If you require the strings to be in order, then you can use this:
    Code:
    #!/bin/sh
    
    # @(#) spgrep   Search for several patterns in order.
    
    # spgrep pattern1 pattern2 ... patternn -- file1 file2 ... filen
    
    debug="echo"
    debug=":"
    
    p=""
    first=true
    
    for arg
    do
      if [ "$arg" = "--" ]
      then
        $debug " found --, shifting and breaking."
        shift
        break
      else
        if [ -z "$p" ]
        then
          p="$arg"
        else
          p="$p.*$arg"
        fi
      fi
      shift
    done
    
    $debug " pattern = :$p:"
    $debug " remaining arguments = :$*:"
    
    for file
    do
      $debug " searching $file for :$p:"
      grep -H "$p" "$file"
    done
    
    exit 0
    This follows the same basic sequence as mgrep (above): it builds a regular expression for grep from your strings, joining them with ".*" so that any text may sit between one string and the next, then searches each file. You can pass all the files to a single grep if you prefer, and you can remove the "-H" if you don't want the filename displayed, etc. Swap the two debug= lines to print the intermediate values.
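    As a small illustration of why the order matters, the sketch below repeats only the joining step and runs it against the sample data from the first post. join_patterns is a hypothetical name for that step. In the joined expression, "." matches any single character and "*" repeats it zero or more times.

```shell
#!/bin/sh
# join_patterns PATTERN...: join the arguments with ".*", as spgrep does.
# Hypothetical helper name, for illustration only.
join_patterns() {
    p=''
    for arg in "$@"; do
        if [ -z "$p" ]; then
            p=$arg
        else
            p="$p.*$arg"
        fi
    done
    printf '%s\n' "$p"
}

db=$(mktemp)
cat > "$db" <<'EOF'
1) testword1 testword2 text1 text2 001 002 003
2) testword2 text3 003 004 005
3) testword1 testword2 text3
EOF

join_patterns testword1 text3                     # the expression handed to grep
grep "$(join_patterns testword1 text3)" "$db"     # words in this order: line 3
grep "$(join_patterns text3 testword1)" "$db" || echo "no match in reverse order"
rm -f "$db"
```

    The joined expression only matches a line where the first word appears somewhere before the second, which is the difference from the any-order search in mgrep.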

    With a little bit of extra work, you can use the same technique to build an awk line ... cheers, drl

  6. #6
    Linux Engineer Freston's Avatar
    Join Date
    Mar 2007
    Location
    The Netherlands
    Posts
    1,047
    I'm sorry, it might be my blunt understanding...


    But I have no idea how this last script could work. As I read it, the loop takes pattern1 pattern2 pattern3 and changes it to pattern1.pattern2.pattern3. And I have no idea how that could be fed into grep and come up with something useful.

    Am I misinterpreting something? Anyone?

    ----

    Oh and the Perl script is a bit 'not-ready'. It needs some work. But that's fine, I was gonna delve deeper into Perl anyway. This gives me probable cause. Don't post links! I've got hours of reading ahead of me.

    But if there are Perl n00bs out there who wish to tackle n00bish problems to sharpen their wits, I am eager to participate.
