  1. #1
    yat
    Just Joined!
    Join Date
    May 2011
    Posts
    8

    Calculate differences, combining columns from separate files into 1 file


    I have 3 files:

    The first one has 3 columns (stat1.txt):

    3.4 97.3 7
    3.2 96.2 5
    3.0 95.0 6

    File 2 also has 3 columns (stat2.txt):

    3.3 96.0 6
    3.1 94.5 8
    2.9 94.1 6

    File 3 has 2 columns (num.txt):
    3.2 0.5
    3.3 0.7
    2.9 0.4

    I want to calculate the difference between the column 2 values of files 1 and 2, like this:
    97.3 - 96.0 = 1.3
    96.2 - 94.5 = 1.7
    95.0 - 94.1 = 0.9

    and then combine column 2 of file 3 (num.txt), column 2 of file 1 (stat1.txt), column 2 of file 2 (stat2.txt), and the resulting differences, and print them into one file like this:

    0.5 97.3 96.0 1.3
    0.7 96.2 94.5 1.7
    0.4 95.0 94.1 0.9

    Any idea how to do this with an awk script, grep, or whatever script? I appreciate your help.

  2. #2
    Linux Newbie
    Join Date
    Nov 2012
    Posts
    224
    Hi,

    awk would be easier to use, as there is some arithmetic involved.

    Read each file into an array; then, in the END block, walk the arrays and do what's needed.

    More or less.
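    A minimal sketch of that idea (assuming the files are given on the command line in the order stat1.txt stat2.txt num.txt and their rows line up one to one; the script name used below is arbitrary):
    Code:
    #!/usr/bin/awk -f
    # Store column 2 of every input file, keyed by (file number, row number).
    FNR == 1 { filenum++ }            # FNR resets to 1 at the start of each file
    { col2[filenum, FNR] = $2 }

    END {
        # Files assumed on the command line in the order: stat1.txt stat2.txt num.txt
        for (row = 1; row <= FNR; row++)
            print col2[3, row], col2[1, row], col2[2, row], col2[1, row] - col2[2, row]
    }
    Invoked as, say, awk -f combine.awk stat1.txt stat2.txt num.txt, it prints the four columns shown in the question.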

  3. #3
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,283
    Hi.

    Here is a comparison between awk and dm (from the manstat package).
    Code:
    #!/usr/bin/env bash
    
    # @(#) s1	Demonstrate awk and dm comparison for simple arithmetic on fields.
    # See: http://hcibib.org/perlman/stat/index.html
    
    # Utility functions: print-as-echo, print-line-with-visual-space, debug.
    # export PATH="/usr/local/bin:/usr/bin:/bin"
    pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
    pl() { pe;pe "-----" ;pe "$*"; }
    db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
    db() { : ; }
    C=$HOME/bin/context && [ -f $C ] && $C paste awk dm
    
    pl " Input data file data[123]:"
    head data[123]
    
    pl " Results, dm:"
    paste data[123] |
    tee f1 |
    dm x8 x2 x5 x2-x5
    
    pl " Results, awk:"
    awk '{print $8,$2,$5,$2-$5}' f1
    
    pl " Augmented input data by field:"
    # Generate headers.
    pe "12345678" |
    sed 's/./f& /g' > f2
    # Append data.
    cat f1 >> f2
    # Standardize separators, display aligned data.
    sed 's/\t/ /g' f2 |
    align -g4 -j_
    
    exit 0
    producing:
    Code:
    % ./s1
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
    bash GNU bash 3.2.39
    paste (GNU coreutils) 6.10
    awk GNU Awk 3.1.5
    dm - ( local: ~/executable/dm, 2009-11-09 )
    
    -----
     Input data file data[123]:
    ==> data1 <==
    3.4 97.3 7
    3.2 96.2 5
    3.0 95.0 6
    
    ==> data2 <==
    3.3 96.0 6
    3.1 94.5 8
    2.9 94.1 6
    
    ==> data3 <==
    3.2 0.5
    3.3 0.7
    2.9 0.4
    
    -----
     Results, dm:
    0.5	97.3	96	1.3
    0.7	96.2	94.5	1.7
    0.4	95	94.1	0.9
    
    -----
     Results, awk:
    0.5 97.3 96.0 1.3
    0.7 96.2 94.5 1.7
    0.4 95.0 94.1 0.9
    
    -----
     Augmented input data by field:
    f1     f2      f3    f4     f5      f6    f7     f8
    3.4    97.3    7     3.3    96.0    6     3.2    0.5
    3.2    96.2    5     3.1    94.5    8     3.3    0.7
    3.0    95.0    6     2.9    94.1    6     2.9    0.4
    I find it is sometimes simpler to combine multiple input files and then work with field numbers, rather than deal with the complexity of keeping track of several files while operating on them. Pasting the files together is one way of doing that.
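    With the three files from the question pasted in that order, the fields work out to $1-$3 for stat1.txt, $4-$6 for stat2.txt and $7-$8 for num.txt, so the whole job reduces to something like:
    Code:
    paste stat1.txt stat2.txt num.txt | awk '{print $8, $2, $5, $2 - $5}'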

    The final display of the combined file input data is not critical to the operation.

    See the link in the script comments if you are interested in manstat and / or dm.

    Best wishes ... cheers, drl

  4. #4
    Linux Newbie
    Join Date
    Jun 2012
    Location
    SF Bay area
    Posts
    149
    I'd never come across the paste command before and will definitely use it in the future. It will come in handy!

    Here's another version in awk that I think is interesting too. It's only practical if the number of records in the files is small enough that everything fits in memory, though.

    Code:
    #!/bin/gawk -f
    
    # For every input file, count how many records it has produced so far,
    # and store column 2 keyed by "filename:row".
    {
        rec = ++rcount[FILENAME];
        nfile[rec]++;
        second[FILENAME ":" rec] = $2;
    }
    
    END \
    {
        k1 = "stat1.txt:";
        k2 = "stat2.txt:";
        k3 = "num.txt:";
    
        # Print a row only if all three files supplied a value for it.
        for (rec = 1; rec <= NR; rec++)
            if (nfile[rec] == 3)
                printf "%5.1f %5.1f %5.1f %5.1f\n", \
                       second[k3 rec], second[k1 rec], second[k2 rec], second[k1 rec] - second[k2 rec];
    }
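    Assuming the script is saved as, say, combine.awk (the name is arbitrary), it can be run against the three files from the question like this:
    Code:
    gawk -f combine.awk stat1.txt stat2.txt num.txt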

  5. #5
    Just Joined!
    Join Date
    Oct 2006
    Posts
    2

    Another solution

    Quote Originally Posted by yat View Post
    Any idea how to do this with an awk script, grep, or whatever script? I appreciate your help.
    Unless you specifically want to do this with a script, wouldn't it be much simpler to load each file into a LibreOffice Calc spreadsheet, do the calculations there, move the numbers into the appropriate columns, and then export just the resulting columns to a text file?

  6. #6
    Linux Newbie
    Join Date
    Nov 2012
    Posts
    224
    yat is clear about that:
    Quote Originally Posted by yat
    any idea how to do this with an awk script, grep, or whatever script

  7. #7
    Linux Enthusiast
    Join Date
    Jan 2005
    Location
    Saint Paul, MN
    Posts
    626
    Quote Originally Posted by yat View Post
    Any idea how to do this with an awk script, grep, or whatever script? I appreciate your help.
    How about Python? This example is a bit more involved than it needs to be: I did not extract only the needed columns and kept it as a general approach (except for the output step).
    Code:
    #!/usr/bin/env python
    #
    
    # Read a whitespace-separated file into a list of rows,
    # each row being a list of column strings.
    def read_file_row_column(filename):
        lines = [ ]
        with open(filename, 'r') as file_ptr:
            for line in file_ptr.readlines():
                lines.append(line.split())
        return lines
    
    # Read in data for all files (referenced as filedata[fileidx][row][col], all zero-based).
    filedata = [ ]
    for fn in [ 'stat1.txt', 'stat2.txt', 'num.txt' ]:
        filedata.append(read_file_row_column(fn))
    
    
    # Produce the desired output: num col 2, stat1 col 2, stat2 col 2, and the difference.
    for row in xrange(len(filedata[0])):
        print filedata[2][row][1], filedata[0][row][1], filedata[1][row][1], \
              (float(filedata[0][row][1]) - float(filedata[1][row][1]))
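    Assuming the script is saved as, say, stats_combine.py (a placeholder name) in the same directory as the three files, it prints the combined columns to stdout; redirect that into a file to get the single output file. Note the script uses Python 2 syntax (print statement, xrange), so run it with a Python 2 interpreter:
    Code:
    python stats_combine.py > combined.txt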

  8. #8
    Just Joined!
    Join Date
    Oct 2006
    Posts
    2
    Quote Originally Posted by yat View Post
    Any idea how to do this with an awk script, grep, or whatever script? I appreciate your help.
    From the command line:
    Code:
    paste stat1.txt stat2.txt num.txt >all.txt
    awk '{print $8,$2,$5,$2-$5}' all.txt >done.txt

    OR

    place the above two lines in a file and execute it.
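    A minimal sketch of such a file (the name combine.sh is just a placeholder):
    Code:
    #!/bin/sh
    # Paste the three files side by side, then pick out num col 2,
    # stat1 col 2, stat2 col 2 and the difference of the last two.
    paste stat1.txt stat2.txt num.txt > all.txt
    awk '{print $8,$2,$5,$2-$5}' all.txt > done.txt
    Make it executable with chmod +x combine.sh and run ./combine.sh.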

    Code:
    cat done.txt
    0.5 97.3 96.0 1.3
    0.7 96.2 94.5 1.7
    0.4 95.0 94.1 0.9
