  1. #1
    Just Joined!
    Join Date
    Mar 2008
    Posts
    1

    Removing specific lines from a file

    Hi,
    I have a huge file with ~500,000 records. I want to remove the records that satisfy a certain criterion (fields 3 and 4 both have the value 0). I wrote the following script, but I am getting errors:

    #!/bin/sh
    lno=0
    while read line
    do
    lno=`expr $lno + 1`
    echo "line read" $lno
    annot =`echo $line | cut -f 3 -d " "`
    gen = `echo $line | cut -f 4 -d " "`

    if [ $annot -eq 0 ]
    then
    if [ $gen -eq 0 ]
    then
    echo "delete it"
    fi
    else
    echo $line >> balancedset
    fi
    done < ./newTrainingset12.dat
    -----------------------------------------------#
    Errors:
    removeoutliers.sh: line 8: annot: command not found
    removeoutliers.sh: line 9: gen: command not found
    removeoutliers.sh: line 11: [: -eq: unary operator expected


    -----
    I tried the command substitution at the prompt, and it works there.
    Could anyone tell me what's wrong here?
    Thanks

  2. #2
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts
    1,117
    Hi.

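    Do not use whitespace around "=" in assignment statements. With the spaces, the shell treats "annot" as a command name and the rest as its arguments, hence "command not found". The variable is then never set, so the test collapses to "[ -eq 0 ]" and fails with "unary operator expected".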

    That's a common typo for folks who go back and forth among many languages.

    On the other hand, you must supply whitespace around "[" and "]" and around the comparison operators, such as "=" or "-eq", in the if statements.
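    As a minimal sketch of both rules side by side (the sample record and the "verdict" variable are made up for illustration, they are not from the original script):

```shell
#!/bin/sh
# Hypothetical sample record: fields separated by single spaces.
line="id42 foo 0 0"

# Assignment: no whitespace on either side of "=".
annot=$(echo "$line" | cut -f 3 -d " ")
gen=$(echo "$line" | cut -f 4 -d " ")

# Test: whitespace is required around "[", "]" and the operator.
# Quoting keeps the test well-formed even if a variable is empty.
if [ "$annot" -eq 0 ] && [ "$gen" -eq 0 ]
then
    verdict="delete it"
else
    verdict="keep it"
fi
echo "$verdict"
```

    Quoting the variables is optional when they are known to be non-empty, but it turns a confusing "unary operator expected" into a clearer error if an assignment ever fails.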

    Hang in there, your fingers will soon learn what's right where ... cheers, drl

  3. #3
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    And you can combine your two tests into one that short-circuits:
    Code:
    if [ "$annot" -eq 0 ] && [ "$gen" -eq 0 ]
    then
    ...
    Don't forget to truncate your output file before you start, in case you rerun your script:
    Code:
    >balancedset
    will do it, before you start your "while read line" loop.
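    Putting the pieces together, one possible shape for the whole loop is sketched below. The temporary directory and the four sample records are assumptions that stand in for newTrainingset12.dat so the sketch can be run on its own; the rule applied is the one stated in the question (drop a record only when fields 3 and 4 are both 0).

```shell
#!/bin/sh
# Hypothetical sample data standing in for newTrainingset12.dat.
workdir=$(mktemp -d)
infile="$workdir/sample.dat"
outfile="$workdir/balancedset"
printf '%s\n' 'a b 0 0' 'c d 0 5' 'e f 7 0' 'g h 1 2' > "$infile"

# Truncate the output once, so a rerun does not append to old results.
: > "$outfile"

while IFS= read -r line
do
    # Extract fields 3 and 4, assuming single-space separators.
    annot=$(printf '%s\n' "$line" | cut -f 3 -d " ")
    gen=$(printf '%s\n' "$line" | cut -f 4 -d " ")

    if [ "$annot" -eq 0 ] && [ "$gen" -eq 0 ]
    then
        continue    # both fields are 0: drop the record
    fi
    printf '%s\n' "$line" >> "$outfile"
done < "$infile"

cat "$outfile"
```

    Spawning cut twice per record is slow on a file with ~500,000 records, so treat this as a way to see the corrected logic, not as the fastest approach.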

  4. #4
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    For serious parsing and efficiency, awk can do it in one line. This keeps every record unless fields 3 and 4 are both 0:
    Code:
    awk '!($3 == 0 && $4 == 0)' file > newfile
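    A quick way to check an awk filter against the stated requirement (drop a record only when fields 3 and 4 are both 0) is to run it on a few hand-made records. The four records below are hypothetical and cover every zero/non-zero combination:

```shell
#!/bin/sh
# Hypothetical sample: one record per zero/non-zero combination of fields 3 and 4.
sample=$(printf '%s\n' 'a b 0 0' 'c d 0 5' 'e f 7 0' 'g h 1 2')

# Keep a record unless field 3 and field 4 are both 0.
kept=$(printf '%s\n' "$sample" | awk '!($3 == 0 && $4 == 0)')
printf '%s\n' "$kept"
```

    Only the "a b 0 0" record should be missing from the output. A filter written as '$3 != 0 && $4 != 0' would also drop the records where just one of the two fields is 0.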
