Results 1 to 8 of 8
I am trying to get away from using excel for something.
I have some data that is in a consistent format that I need to consolidate multiple similar lines. Hard ...
- 10-21-2009 #1Just Joined!
- Join Date
- May 2008
- Posts
- 8
help do summing in csv with bash (or whatever)
I am trying to get away from using excel for something.
I have some data that is in a consistent format that I need to consolidate multiple similar lines. Hard to describe so I will put an example below:
part, description, count, dollars, newcount, newdollars, diffcount, diffdollars
tool1, this is tool1, 1, 5.00, 2, 10.00, 1, 5.00
tool1, this is tool1, 3, 15.00, 2, 10.00, 1, 5.00
tool2, this is tool2, 1, 2.00, 2, 4.00, 1, 2.00
tool2, this is tool2, 5, 10.00, 2, 4.00, 3, 6.00
tool3, this is tool1, 1, 5.00, 2, 10.00, 1, 5.00
and I need to sum up each break in part number to get
tool1, this is tool1, 4, 20.00, 4, 20.00, 0, 0.00
tool2, this is tool2, 6, 12.00, 4, 8.00, 4, 8.00
tool3, this is tool1, 1, 5.00, 2, 10.00, 1, 5.00
I am pretty decent with bash but cant write any perl. any help would be awesome.
also, the data is in a flat text file so I would think to use awk to identify the fields by position as either part of the script or do it before hand to convert it to csv or whatever is appropriate.
thanks
- 10-22-2009 #2Debian GNU/Linux -- You know you want it.
- 10-22-2009 #3Linux Newbie
- Join Date
- Sep 2004
- Location
- UK
- Posts
- 160
Starting point (well it splits the fields up). Use another array to hold the previous result/summing.
Code:#!/bin/bash line="tool1, this is tool1, 4, 20.00, 4, 20.00, 0, 0.00" tmp[0]="`echo ${line} | cut -d , -f 1`" tmp[1]="`echo ${line} | cut -d , -f 2`" fTmp="`echo ${line} | cut -d , -f 3- | tr ',' ' '`" echo ${fTmp} | while read tmp[2] tmp[3] tmp[4] tmp[5] tmp[6] tmp[7] do echo ${tmp[0]} echo ${tmp[1]} echo ${tmp[2]} echo ${tmp[3]} echo ${tmp[4]} echo ${tmp[5]} echo ${tmp[6]} echo ${tmp[7]} doneIn a world without walls and fences, who needs Windows and Gates?
- 10-22-2009 #4Just Joined!
- Join Date
- May 2008
- Posts
- 8
.
Blinky, I dont get it? this just seems to break down the data but not process it.
GNU-Fan, I have never done any summing with awk, much less at a change in one of the fields. Can you help?
- 10-22-2009 #5Linux Newbie
- Join Date
- Sep 2004
- Location
- UK
- Posts
- 160
to runCode:#!/bin/bash tp[0]="" while read line do #line="tool1, this is tool1, 1, 5.00, 2, 10.00, 1, 5.00" a="`echo ${line} | cut -d , -f 1`" b="`echo ${line} | cut -d , -f 2`" fTmp=`echo ${line} | cut -d , -f 3- | tr ',' ' '` eval `echo ${fTmp} | while read c d e f g h do echo c=$c echo d=$d echo e=$e echo f=$f echo g=$g echo h=$h done` if [ "${tp[0]}" == ${a} ] ; then #echo same tp[2]=$((tp[2]+c)) tp[3]=$((tp[3]+d)) tp[4]=$((tp[4]+e)) tp[5]=$((tp[5]+f)) tp[6]=$((tp[6]+g)) tp[7]=$((tp[7]+h)) else #echo nsame if [ -n "${tp[0]}" ] ; then echo "${tp[0]}, ${tp[1]}, ${tp[2]}, ${tp[3]}, ${tp[4]}, ${tp[5]}, ${tp[6]}, ${tp[7]}" fi tp[0]=${a} tp[1]=${b} tp[2]=${c} tp[3]=${d} tp[4]=${e} tp[5]=${f} tp[6]=${g} tp[7]=${h} fi done echo "${tp[0]}, ${tp[1]}, ${tp[2]}, ${tp[3]}, ${tp[4]}, ${tp[5]}, ${tp[6]}, ${tp[7]}"
cat datafile | sort | script
The only thing is it can only deal with integers (eg 5 rather than 5.00 etc)In a world without walls and fences, who needs Windows and Gates?
- 10-22-2009 #6Just Joined!
- Join Date
- May 2008
- Posts
- 8
wow
ok, so I'm trying to decompile this and understand it. correct me if my assumptions are wrong.
when setting the variables a and b, you are picking the items that will not be summed right?
here is a sample of my actual data and the awk is use to get it:
I grep it to clean out the useless lines then I use awk and fieldwidths because the built in field guesser doesnt get it right.
now, the fields $1 through $8 are all just text fields while $9 - $14 are the ones I want to sum up.cat file.txt | grep ^[a-z,A-Z,0-9] | grep -v -e ^Item -e ^COST\ COMP| awk 'BEGIN {FIELDWIDTHS="4 13 4 31 7 7 6 7 11 8 11 8 11 2";} {print $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12 $13 $14;}'
here is an example of the data:
though the data is alligned in rows but I cant get that exact format to display here. basically, anything after the EA and EACH fields is the $9 - $14 that I need to sup up.ANC 2X100 25 B2100 2X100 BRASS GAUGE 005501 EA 0 0.00 0 0.00 0 0.00
ANC 2X200 25 B2200 2" 200# PRESSURE GAGE 151414 S42 EACH 1 4.98 0 0.00 1- 4.98
ANC 2X200 25 B2200 2" 200# PRESSURE GAGE 165216 EACH 0 0.00 1 4.98 1 4.98
ANC 2X3000 25 2" 3000# PRESSURE GAGE 005502 EACH 0 0.00 0 0.00 0 0.00
ANC 2X30RL 25 B230RL 2X40 REDLINE BRASS GAUG 005503 EA 0 0.00 0 0.00 0 0.00
ANC 2X400 25 B2400 2X400 BRASS GAUGE 005515 EA 0 0.00 0 0.00 0 0.00
ANC 2X4-100 25 MP1 1.00 MAG LENS 005504 EACH 0 0.00 0 0.00 0 0.00
ANC 2X4-10 25 2X4-1/4 SH 10 LENS 145804 CLS EACH 1 0.63 0 0.00 1- 0.63
ANC 2X4-10 25 2X4-1/4 SH 10 LENS 169013 EACH 0 0.00 1 0.63 1 0.63
ANC 2X4-11 25 2X4-1/4 SH 11 LENS 151919 S46 EACH 37 23.31 0 0.00 37- 23.31
ANC 2X4-11 25 2X4-1/4 SH 11 LENS 165217 EACH 0 0.00 38 23.94 38 23.94
would I be correct in saying that I could set
?a=$1
b=$2
c=$3
d=$4
e=$5
f=$6
g=$7
h= $8
I dont know what to do here:
also, I'm guessing that instead iffTmp=`echo ${line} | cut -d , -f 3- | tr ',' ' '`
I would doecho c=$c
echo d=$d
echo e=$e
echo f=$f
echo g=$g
echo h=$h
im I on the right track or way off here?echo i=$i
echo j=$j
echo k=$k
echo l=$l
echo m=$m
echo n=$n
- 10-23-2009 #7Linux Newbie
- Join Date
- Sep 2004
- Location
- UK
- Posts
- 160
The bash way won't work, it can only do integer maths.
Also where you use awk to cleanup the data, output as csv (comma delimited)
As I said the bash way won't work, so tour probably going have to use perl or something else. But you do need to get you data formatted correctly to make life easier (with respect to how fields are delimited etc)Code:ANC,2X100,25,B2100,2X100,BRASS GAUGE,005501,EA,0,0.00,0,0.00,0,0.00 ANC,2X200,25,B2200,2",200# PRESSURE GAGE,151414,S42 EACH,1,4.98,0,0.00,1-,4.98 ANC,2X200,25,B2200,2",200# PRESSURE GAGE,165216,EACH,0,0.00,1,4.98,1,4.98 ANC,2X3000,25,2",3000#,PRESSURE GAGE,005502,EACH,0,0.00,0,0.00,0,0.00 ANC,2X30RL,25,B230RL,2X40,REDLINE BRASS GAUG,005503,EA,0,0.00,0,0.00,0,0.00 ANC,2X400,25,B2400,2X400,BRASS GAUGE,005515,EA,0,0.00,0,0.00,0,0.00 ANC,2X4-100,25,MP1,1.00,MAG LENS,005504,EACH,0,0.00,0,0.00,0,0.00 ANC,2X4-10,25,2X4-1/4,SH,10 LENS,145804,CLS EACH,1,0.63,0,0.00,1-,0.63 ANC,2X4-10,25,2X4-1/4,SH,10 LENS,169013,EACH,0,0.00,1,0.63,1,0.63 ANC,2X4-11,25,2X4-1/4,SH,11 LENS,151919,S46 EACH,37,23.31,0,0.00,37-,23.31 ANC,2X4-11,25,2X4-1/4,SH,11 LENS,165217,EACH,0,0.00,38,23.94,38,23.94
In a world without walls and fences, who needs Windows and Gates?
- 10-23-2009 #8Linux Newbie
- Join Date
- Sep 2004
- Location
- UK
- Posts
- 160
should fix the limitation of integer on mathsCode:tp[2]=`echo "${tp[2]} + ${c}" | bc` tp[3]=`echo "${tp[2]} + ${d}" | bc` tp[4]=`echo "${tp[2]} + ${e}" | bc` tp[5]=`echo "${tp[2]} + ${f}" | bc` tp[6]=`echo "${tp[2]} + ${g}" | bc` tp[7]=`echo "${tp[2]} + ${h}" | bc`In a world without walls and fences, who needs Windows and Gates?


Reply With Quote
