  1. #1
    Just Joined!
    Join Date
    Aug 2012
    Posts
    1

    count the unique rows based on certain columns


    Hi everyone,

    I have a file result.txt with records as follows, and another file mirna.txt with a list of miRNAs, e.g. miR22, miR123, miR13, etc.

    Gene Transcript miRNA

    Gar Nm_111233 miR22
    Gar Nm_123440 miR22
    Gar Nm_129939 miR22
    Hel Nm_233900 miR13
    Hel Nm_678900 miR13
    Bart Nm_178181 miR22

    Now I want to count the number of unique genes for each miRNA in mirna.txt


    e.g. miR22 2
    miR13 1
    miR15 0



    Previously, I used the following command, but it counts every occurrence of the miRNA.

    for gene in `cat mirna.txt`; do awk -v gene=$gene '{for(i=1; i<=NF; i++) if ($i==gene) c++} END {print c}' result.txt>>output.txt; done;


    Any help is appreciated. Thanks in advance.


    Mic

  2. #2
    tpl
    Linux User
    Join Date
    Jan 2007
    Location
    cleveland
    Posts
    478
    welcome to the forum

    here's a quick hack that seems to work for your sample files:

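    # keep the gene and miRNA columns, then drop duplicate gene/miRNA pairs
    cut -d" " -f1,3 <result.txt | sort -u >Result.txt
    for i in `cat mirna.txt`
    do
    # -w matches whole words only, so miR1 does not also count miR13
    echo -n "$i "; grep -cw "$i" Result.txt
    done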
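
For what it's worth, the same count can probably be done in a single awk pass, without the intermediate file. This is only a sketch: the function name count_genes is made up for illustration, and it assumes that both files are whitespace-separated, that the gene is in column 1 and the miRNA in column 3 of result.txt, and that the word "miRNA" from the header row is not itself listed in mirna.txt.

```shell
# count_genes RESULT_FILE MIRNA_FILE   (hypothetical helper, not from the thread)
# Prints "<miRNA> <number of distinct genes>" for every miRNA listed in
# MIRNA_FILE, in the order listed, including miRNAs that never occur (count 0).
count_genes() {
    awk '
        # First file (the miRNA list): remember each name once, in order.
        FILENAME == ARGV[1] {
            for (i = 1; i <= NF; i++)
                if (!($i in want)) { want[$i] = 1; order[++n] = $i }
            next
        }
        # Second file (the records): count a gene/miRNA pair only the first
        # time it is seen. Blank lines and the header row fall through,
        # because their third column is not in the miRNA list.
        NF >= 3 && ($3 in want) && !seen[$1, $3]++ { count[$3]++ }
        # Adding 0 turns a never-seen miRNA into a printed 0.
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] + 0 }
    ' "$2" "$1"
}
```

Called as `count_genes result.txt mirna.txt` on the sample data, with miR22, miR13 and miR15 in mirna.txt, it should print "miR22 2", "miR13 1" and "miR15 0", one per line. Because it compares whole fields rather than substrings, miR1 and miR13 stay separate.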
    the sun is new every day (heraclitus)
