  1. #1
    Just Joined!
    Join Date
    Jun 2006
    Posts
    16

    sort and uniq

    Hi there,

    I have a file as below

    K1|K2|dataA1|K3|dataA2|K4|dataA3
    K1|K2|dataB1|K3|dataB2|K4|dataB3


    What I need to do is sort on the keys K1, K2, K3, K4 and remove any duplicates on them (only one record per key combination should remain; it can be any of them). How can I do that using sort and uniq?

  2. #2
    Linux Engineer wje_lf
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    Do the following at the command line.
    Code:
    man sort
    man uniq
    If man pages are not installed on your system, google for this
    Code:
    linux man sort
    and this
    Code:
    man sort uniq
    Hope this helps.
    --
    Bill

    Old age and treachery will overcome youth and skill.

  3. #3
    Linux Newbie radoulov
    Join Date
    Sep 2007
    Posts
    111
    Code:
    awk -F\| '!_[$1,$2,$4,$6]++' file
    This will give you the unique records (the first one seen for each combination of fields 1, 2, 4 and 6).
    Then use sort to order them as you want.
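
A minimal sketch of those two steps chained together, assuming the keys sit in fields 1, 2, 4 and 6 as in the sample rows. The file names `records.txt` and `deduped.txt`, the third sample row and the array name `seen` are made up for illustration.

```shell
# Hypothetical sample input; keys are fields 1, 2, 4 and 6
cat > records.txt <<'EOF'
K1|K2|dataA1|K3|dataA2|K4|dataA3
K1|K2|dataB1|K3|dataB2|K4|dataB3
J1|J2|dataC1|J3|dataC2|J4|dataC3
EOF

# awk keeps the first record seen for each key combination;
# sort then orders the survivors on the same four key fields
awk -F'|' '!seen[$1,$2,$4,$6]++' records.txt |
  sort -t'|' -k1,1 -k2,2 -k4,4 -k6,6 > deduped.txt

cat deduped.txt
```

Because awk prints a record only the first time its key combination appears, the dataA row survives here and the dataB row is dropped.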

  4. #4
    Linux Engineer wje_lf
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    Asif Ahmed Syed, it turns out that radoulov's advice shows you that there are many ways to do things in UNIX/Linux/BSD. I'm not familiar with awk, but radoulov's solution is likely correct.

    If, however, you really do wish to do this entirely with sort and uniq, you can do so. In fact, you can do it entirely with sort.

    The man page will show you how. :)
    --
    Bill

    Old age and treachery will overcome youth and skill.
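
For readers who want to check their reading of the man page, here is one way the sort-only approach could look. It is a sketch, not necessarily the command the poster had in mind, and it assumes the keys are fields 1, 2, 4 and 6; `records.txt`, `unique.txt` and the third sample row are hypothetical.

```shell
# Hypothetical sample input; keys are fields 1, 2, 4 and 6
printf '%s\n' \
  'K1|K2|dataA1|K3|dataA2|K4|dataA3' \
  'K1|K2|dataB1|K3|dataB2|K4|dataB3' \
  'J1|J2|dataC1|J3|dataC2|J4|dataC3' > records.txt

# -t'|'  use | as the field separator
# -kN,N  restrict each sort key to exactly field N
# -u     keep only one line per distinct key combination
sort -t'|' -u -k1,1 -k2,2 -k4,4 -k6,6 records.txt > unique.txt

cat unique.txt
```

With `-u` and explicit keys, lines count as duplicates when their keys compare equal, even if the data fields differ. Which of the duplicate lines is kept should not be relied on, which fits the original requirement that any one of them may remain.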

  5. #5
    Linux Newbie radoulov
    Join Date
    Sep 2007
    Posts
    111
    wje_lf is right.
    I have to admit that I had to re-read the man pages to find the correct syntax.

  6. #6
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    sort has the -u flag to strip out duplicates. Note that uniq requires its input to be already sorted, but the -c flag can be useful to give a count of the number of occurrences of each unique line.
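
A small illustration of both points, using a made-up word list (`words.txt` and the output file names are hypothetical):

```shell
# Hypothetical input with non-adjacent duplicates
printf '%s\n' apple banana apple cherry banana apple > words.txt

# uniq only collapses adjacent duplicate lines, so sort first;
# -c prefixes each distinct line with its number of occurrences
sort words.txt | uniq -c > counts.txt
cat counts.txt

# Without sorting, no two equal lines are adjacent here,
# so uniq removes nothing
uniq words.txt > unsorted_uniq.txt
wc -l < unsorted_uniq.txt
```

Note that uniq compares whole lines by default, so for the pipe-delimited records in the first post it would treat the two sample rows as different; `sort -u` with `-t` and `-k` is the tool that compares on selected fields only.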
