  1. #1
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75

    simple bash script help.

    Good Evening Everyone!

    I am trying to get a little help with a few scripts that I am writing. The first one should read a file and, if there are more than 2 lines that are the same, write 1 of those lines to another file. Here is what I mean:

    I have a file whose lines contain fields separated by tabs:

    data1 data2 data3 data4 data5

    Some of the lines repeat. I may see data1 up to about 21 times in the file. The fields data1 and data2 may stay the same while data4 and data5 change (data4 is a path and data5 is a date). If the entire line appears 2 or more times (a line that appears only once I don't care about), I want to write all of its data to a file.

    My first thought was to try a regular expression with grep, but I gave up when I saw that it was not what I really wanted. Any thoughts on this would be greatly appreciated, and I would be willing to spring for the first few rounds at your local pub/bar!

  2. #2
    Linux Enthusiast
    Join Date
    Aug 2006
    Posts
    631
    Hi,

    To write the double lines to another file you can do something like:

    Code:
    sort file|awk '{if($0 == line){print}else{line = $0}}' >> anotherfile
    Regards
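
    A possible variant of the command above, as a minimal sketch: the awk idiom below prints a line only the second time it is seen, so every repeated line comes out exactly once and no sort is needed. The sample lines are made up for illustration and are not from the real file.

```shell
# Hypothetical sample input; the real status file is not shown here.
sample=$(printf '%s\n' 'a 1' 'b 2' 'a 1' 'c 3' 'a 1' 'b 2')

# seen[$0]++ returns the count *before* incrementing, so the condition
# is true only on the second occurrence of each distinct line.
dups=$(printf '%s\n' "$sample" | awk 'seen[$0]++ == 1')

printf '%s\n' "$dups"
```

    Unlike sort piped into uniq, this keeps the lines in the order in which their second occurrence appears in the input.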

  3. #3
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75
    For some reason that command gives me only part of the file, about 10 lines. The file has over 5,000 lines in it.

  4. #4
    Linux Enthusiast
    Join Date
    Aug 2006
    Posts
    631
    Try the uniq command. To output only unique lines:

    Code:
    sort file|uniq
    To show the double lines (1 time):

    Code:
    sort file|uniq -d
    Check the man page of uniq.

    Regards

  5. #5
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75
    Thank you for your response. I am noticing that the output is only 16 lines. Is this a limitation of sort? I tried to use cat in place of sort and nothing is displayed.

    I do appreciate the help very much!

  6. #6
    Linux Enthusiast
    Join Date
    Aug 2006
    Posts
    631
    It's not really clear what your file looks like and what output you're expecting.
    Suppose you have a file like this:

    Code:
    line 5 /data/path name5
    line 1 /data/path name1
    line 2 /data/path name2
    line 4 /data/path name4
    line 3 /data/path name3
    line 1 /data/path name1
    line 2 /data/path name2
    line 6 /data/path name6
    line 1 /data/path name1
    Sorted:

    Code:
    line 1 /data/path name1
    line 1 /data/path name1
    line 1 /data/path name1
    line 2 /data/path name2
    line 2 /data/path name2
    line 3 /data/path name3
    line 4 /data/path name4
    line 5 /data/path name5
    line 6 /data/path name6
    With the awk command you'll get this output:

    Code:
    line 1 /data/path name1
    line 1 /data/path name1
    line 2 /data/path name2
    with the first uniq command:

    Code:
    line 1 /data/path name1
    line 2 /data/path name2
    line 3 /data/path name3
    line 4 /data/path name4
    line 5 /data/path name5
    line 6 /data/path name6
    and with the second uniq command:

    Code:
    line 1 /data/path name1
    line 2 /data/path name2
    Regards
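
    If the number of repeats matters, uniq -c can report it. The following is only a sketch, using the same sample file as above; the threshold of 2 and the tab-separated output layout are assumptions, not something the thread specifies.

```shell
# The nine sample lines from this post.
sample=$(printf '%s\n' \
  'line 5 /data/path name5' 'line 1 /data/path name1' \
  'line 2 /data/path name2' 'line 4 /data/path name4' \
  'line 3 /data/path name3' 'line 1 /data/path name1' \
  'line 2 /data/path name2' 'line 6 /data/path name6' \
  'line 1 /data/path name1')

# uniq -c prefixes each distinct line with its count (padded with spaces).
# awk keeps lines whose count is at least 2, strips the padded prefix,
# and prints "count<TAB>line".
counts=$(printf '%s\n' "$sample" | sort | uniq -c |
  awk '$1 >= 2 { n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }')

printf '%s\n' "$counts"
```

    Changing the 2 in the awk condition to 3 would keep only lines that appear at least three times.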

  7. #7
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75
    The file is a copy of the backup schedules I have running. It's really an activity log. The file has 4 columns and about 5,000 lines.

    I need to take this file and, keeping the lines as they are, write any line that appears 2 or more times just once to another file. Here is a modified sample of what the file looks like:

    Status file shows:
    Code:
    BKP1000        AID0002    Backup\Path      Sunday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Saturday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Friday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Thursday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Sunday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Saturday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Friday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Thursday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Monday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Sunday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Monday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Monday 10:56:02 PM
    ETC

    Some of these lines repeat and some do not. I want the ones that repeat written to another file, which I will call test.

    Test should look like:
    Code:
    BKP1000        AID0002    Backup\Path      Sunday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Saturday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Friday 10:56:02 PM
    BKP1000        AID0002    Backup\Path      Thursday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Sunday 10:56:02 PM
    BKP3003        AID0005    Backup\Path      Monday 10:56:02 PM
    The status file (the first file) has over 5,000 lines of data. Some lines repeat 2 or more times (sometimes 3) and others don't repeat. I need all of the ones in the file that repeat. Also, the Backup\Path may change while the BKP#### and AID#### stay the same.

    Franklin, I can't thank you enough for looking into this. I did not think it would be as hard as this. I have been working on this file for a while now (stripping useless information and removing characters the shell doesn't like).
    Last edited by Korelis; 10-10-2007 at 05:49 PM. Reason: Cleaning up grammer.

  8. #8
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75
    I think I figured out why it's not working. The lines are in fact not identical because of the time stamp. I think there is a way to remove the characters after the day with awk, but I am having trouble finding the command.

  9. #9
    Linux Enthusiast
    Join Date
    Aug 2006
    Posts
    631
    If the file test contains only the repeated lines, why does it include the following line?

    Code:
    BKP3003        AID0005    Backup\Path      Sunday 10:56:02 PM
    It is not repeated in the status file.

    Must the numbers in the first 2 columns be ignored?

    Quote Originally Posted by Korelis View Post
    I think I figured out why it's not working. The lines are in fact not identical because of the time stamp. I think there is a way to remove the characters after the day with awk, but I am having trouble finding the command.
    You can print the first 4 columns, without the time, with awk as follows:

    Code:
    awk '{print $1, $2, $3, $4}' file
    Regards
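
    Putting the two steps together, a rough sketch of the whole job might look like this. The sample lines follow the layout posted earlier in the thread, but the differing times are invented for illustration, and the output is kept in a shell variable here instead of being written to the test file.

```shell
# Hypothetical excerpt in the layout from the thread; times are made up.
status=$(printf '%s\n' \
  'BKP1000 AID0002 Backup\Path Sunday 10:56:02 PM' \
  'BKP1000 AID0002 Backup\Path Sunday 10:57:41 PM' \
  'BKP3003 AID0005 Backup\Path Monday 10:56:02 PM' \
  'BKP1000 AID0002 Backup\Path Friday 10:56:02 PM' \
  'BKP3003 AID0005 Backup\Path Monday 11:02:13 PM')

# 1. Keep only the first 4 whitespace-separated fields (drops the time).
# 2. Sort so that identical lines become adjacent.
# 3. uniq -d prints one copy of each line that occurs more than once.
result=$(printf '%s\n' "$status" |
  awk '{print $1, $2, $3, $4}' | sort | uniq -d)

printf '%s\n' "$result"
```

    With a real file, the printf at the start of the pipeline would be replaced by the file name as an argument to awk, and the output redirected to the test file. Note that awk joins the fields with single spaces, so the original tab layout is not preserved in this sketch.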

  10. #10
    Just Joined!
    Join Date
    Oct 2007
    Location
    Houston
    Posts
    75
    Quote Originally Posted by Franklin52 View Post
    If the file test contains only the repeated lines, why does it include the following line?

    Code:
    BKP3003        AID0005    Backup\Path      Sunday 10:56:02 PM
    It is not repeated in the status file.

    Must the numbers in the first 2 columns be ignored?

    You can print the first 4 columns, without the time, with awk as follows:

    Code:
    awk '{print $1, $2, $3, $4}' file
    Regards
    Hello Franklin,

    I think the script worked as it was supposed to, but since the time field contains seconds and the seconds differ on some of the lines, it did not print them.

    You are correct about the line. My copy and paste didn't work the way I wanted it to. It should be just as you said: if a line repeats 2 times in the status file, write 1 copy of it to the test file. Each BKP and AID line is going to contain different paths. Since the path changes, it may yield a different day and time for those lines.

    I hope I haven't confused you too much.

...