- 11-23-2012 #1Just Joined!
- Join Date
- May 2011
- Posts
- 8
Calculate differences, combining columns from separate files into 1 file
I have 3 files:
The first one has 3 columns (stat1.txt):
3.4 97.3 7
3.2 96.2 5
3.0 95.0 6
File 2 also has 3 columns (stat2.txt):
3.3 96.0 6
3.1 94.5 8
2.9 94.1 6
File 3 has 2 columns (num.txt):
3.2 0.5
3.3 0.7
2.9 0.4
I want to calculate the difference between column 2 of file 1 and column 2 of file 2, like this:
97.3 - 96.0 = 1.3
96.2 - 94.5 = 1.7
95.0 - 94.1 = 0.9
and then combine column 2 of file 3 (num.txt), column 2 of file 1 (stat1.txt), column 2 of file 2 (stat2.txt), and the difference, printing the result into one file like this:
0.5 97.3 96.0 1.3
0.7 96.2 94.5 1.7
0.4 95.0 94.1 0.9
Any idea how to do this with an awk script, grep, or any other script? I appreciate your help.
- 11-23-2012 #2Linux Newbie
- Join Date
- Nov 2012
- Posts
- 136
Hi,
awk would be easier to use, since there is some arithmetic involved.
Read each file into an array.
In the END block, read the arrays to do what's needed.
More or less.
- 11-23-2012 #3Linux Engineer
- Join Date
- Apr 2006
- Location
- Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
- Posts
- 1,202
Hi.
Here is a comparison between awk and manstat dm.
Code:
#!/usr/bin/env bash

# @(#) s1 Demonstrate awk and dm comparison for simple arithmetic on fields.
# See: http://hcibib.org/perlman/stat/index.html

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C paste awk dm

pl " Input data file data[123]:"
head data[123]

pl " Results, dm:"
paste data[123] |
tee f1 |
dm x8 x2 x5 x2-x5

pl " Results, awk:"
awk '{print $8,$2,$5,$2-$5}' f1

pl " Augmented input data by field:"
# Generate headers.
pe "12345678" |
sed 's/./f& /g' > f2
# Append data.
cat f1 >> f2
# Standardize separators, display aligned data.
sed 's/\t/ /g' f2 |
align -g4 -j_

exit 0
producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0.8 (lenny)
bash GNU bash 3.2.39
paste (GNU coreutils) 6.10
awk GNU Awk 3.1.5
dm - ( local: ~/executable/dm, 2009-11-09 )

-----
 Input data file data[123]:
==> data1 <==
3.4 97.3 7
3.2 96.2 5
3.0 95.0 6

==> data2 <==
3.3 96.0 6
3.1 94.5 8
2.9 94.1 6

==> data3 <==
3.2 0.5
3.3 0.7
2.9 0.4

-----
 Results, dm:
0.5 97.3 96 1.3
0.7 96.2 94.5 1.7
0.4 95 94.1 0.9

-----
 Results, awk:
0.5 97.3 96.0 1.3
0.7 96.2 94.5 1.7
0.4 95.0 94.1 0.9

-----
 Augmented input data by field:
f1 f2 f3 f4 f5 f6 f7 f8
3.4 97.3 7 3.3 96.0 6 3.2 0.5
3.2 96.2 5 3.1 94.5 8 3.3 0.7
3.0 95.0 6 2.9 94.1 6 2.9 0.4
I find that it is sometimes simpler to combine multiple input files, and then deal with the field numbers, as opposed to the complexity of keeping track of the files while operating on them. Pasting the files together is a way of doing that.
The final display of the combined file input data is not critical to the operation.
See the link in the script comments if you are interested in manstat and / or dm.
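For anyone who would rather see the paste idea outside the shell, here is a rough Python sketch of the same approach: join the rows of the three files side by side, then pick fields by position. The names `paste_rows` and `report` are made up for this illustration, and the data is typed in from the first post rather than read from disk.

```python
from itertools import chain

def paste_rows(*files_rows):
    """Join row i of every input into one flat list of fields, like paste(1)."""
    return [list(chain.from_iterable(rows)) for rows in zip(*files_rows)]

def report(rows):
    """Format each pasted row as: field 8, field 2, field 5, field 2 - field 5."""
    out = []
    for f in rows:
        # 1-based fields 8, 2 and 5 are list indexes 7, 1 and 4.
        diff = float(f[1]) - float(f[4])
        out.append("%s %s %s %.1f" % (f[7], f[1], f[4], diff))
    return out

# Sample data from the first post, already split into fields.
stat1 = [["3.4", "97.3", "7"], ["3.2", "96.2", "5"], ["3.0", "95.0", "6"]]
stat2 = [["3.3", "96.0", "6"], ["3.1", "94.5", "8"], ["2.9", "94.1", "6"]]
num = [["3.2", "0.5"], ["3.3", "0.7"], ["2.9", "0.4"]]

for line in report(paste_rows(stat1, stat2, num)):
    print(line)
```

The difference is rounded to one decimal with `%.1f` because plain floating-point subtraction would otherwise print values such as 1.2999999999999972.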
Best wishes ... cheers, drl
- 11-23-2012 #4Linux Newbie
- Join Date
- Jun 2012
- Location
- SF Bay area
- Posts
- 101
I'd never found the paste command before and will definitely use it in the future. It will come in handy!
Here's another version in awk that I think is interesting too. It's only practical if the number of records in the files is small enough that you can hold everything in memory, though.
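Code:
#!/bin/gawk -f
# Run with the three files as arguments: stat1.txt stat2.txt num.txt
{
    rec = ++rcount[FILENAME]        # record number within the current file
    nfile[rec]++                    # number of files that have this record
    second[FILENAME ":" rec] = $2   # column 2, keyed by file name and record
}
END {
    # The key prefixes must match FILENAME exactly as given on the command line.
    k1 = "stat1.txt:"; k2 = "stat2.txt:"; k3 = "num.txt:"
    for (rec = 1; rec <= NR; rec++)
        if (nfile[rec] == 3)        # only rows present in all three files
            printf "%5.1f %5.1f %5.1f %5.1f\n",
                second[k3 rec], second[k1 rec], second[k2 rec],
                second[k1 rec] - second[k2 rec]
}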
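If the files are too big to hold in memory, one alternative is to read all three in lockstep, one line at a time. Below is a minimal Python sketch of that idea, assuming the files line up row for row as in the first post. The function name `combine` is only for illustration, and `io.StringIO` objects stand in for the real files.

```python
import io

def combine(stat1, stat2, num):
    """Yield one output line per set of matching input lines.

    Each argument is any iterable of text lines (an open file works),
    so only one line per input is held in memory at a time.
    """
    for a, b, c in zip(stat1, stat2, num):
        x, y = a.split()[1], b.split()[1]
        yield "%s %s %s %.1f" % (c.split()[1], x, y, float(x) - float(y))

# In-memory stand-ins for stat1.txt, stat2.txt and num.txt.
# With real files, pass open("stat1.txt") and so on instead.
s1 = io.StringIO("3.4 97.3 7\n3.2 96.2 5\n3.0 95.0 6\n")
s2 = io.StringIO("3.3 96.0 6\n3.1 94.5 8\n2.9 94.1 6\n")
n = io.StringIO("3.2 0.5\n3.3 0.7\n2.9 0.4\n")

for line in combine(s1, s2, n):
    print(line)
```

Note that `zip` stops at the shortest input, so rows missing from any one file are silently dropped, which is roughly what the `nfile[rec] == 3` check does in the awk version.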
- 11-24-2012 #5Just Joined!
- Join Date
- Oct 2006
- Posts
- 2
Another solution
- 11-24-2012 #6Linux Newbie
- Join Date
- Nov 2012
- Posts
- 136
yat is clear about that:
Originally Posted by yat
- 11-24-2012 #7Linux User
- Join Date
- Jan 2005
- Location
- Saint Paul, MN
- Posts
- 416
How about Python? This example is more complicated than it needs to be, because I did not extract only the needed columns and left it as a general concept (except for the output).
Code:
#!/usr/bin/env python3

def read_file_row_colum(filename):
    """Return the file as a list of rows, each row a list of column strings."""
    lines = []
    with open(filename, 'r') as file_ptr:
        for line in file_ptr:
            lines.append(line.split())
    return lines

# Read in data for all files (referenced as filedata[fileidx][row][col],
# with each index running from zero through N-1).
filedata = []
for fn in ['stat1.txt', 'stat2.txt', 'num.txt']:
    filedata.append(read_file_row_colum(fn))

# Produce the desired output; the difference is rounded to one decimal.
for row in range(len(filedata[0])):
    diff = float(filedata[0][row][1]) - float(filedata[1][row][1])
    print(filedata[2][row][1], filedata[0][row][1], filedata[1][row][1],
          '%.1f' % diff)
- 11-25-2012 #8Just Joined!
- Join Date
- Oct 2006
- Posts
- 2



