- 03-24-2011 #1 Just Joined!
- Join Date
- Mar 2011
- Posts
- 5
gawk command
I have a text file which contains 5 columns of demographic data with column headings and 5 rows of data. The second column contains numerical data, but it has embedded commas, e.g. 123,456,789. I would like to use awk or gawk to output the total of all of the numbers. I know how to set it up for another column which has no commas in the data and get the correct result.
If I only have a single column with the data in it I was able to get the desired result with this command:
gawk --posix -F, '/^,/ (tot+=$1 $2 $3); END{print tot}' file.txt
The computer echoed each row and then printed the total at the bottom. My challenge is: how do I get it to ignore the first column and use the second column in the original file?
Can anyone help with this problem? Thanks.
Larry
- 03-28-2011 #2 Linux User
- Join Date
- Jan 2007
- Location
- cleveland
- Posts
- 452
Have you tried using "cut" to extract the second column from "file.txt", then piping the result to "gawk"? Something like this:
cut -d, -f2,3,4 <file.txt | gawk....
the sun is new every day (heraclitus)
- 03-28-2011 #3 Just Joined!
- Join Date
- Mar 2011
- Posts
- 5
Thanks. I'll try it later tonight and get back with you. Quick question - will it permanently delete the first column or only for this command?
- 03-29-2011 #4 Just Joined!
- Join Date
- Mar 2011
- Posts
- 5
I tried this:
cut -d, -f2,3,4 file.txt|gawk ....
and the output just removed the first column and the digits up to the first comma.
I played around with the cut command replacing the -d, with a TAB and it did delete the first column but I got 0 as the total. Ran out of time to try some other options. Will look at it this weekend. Thanks for the suggestion.
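For anyone following along, the behaviour described above can be reproduced with one row of the table. This is only a sketch: the variable names (`row`, `by_comma`, `by_tab`, `as_number`) are made up for illustration, and the last step assumes the C locale, since some awk modes honour a locale-specific decimal separator.

```shell
# One sample row, TAB-separated as in the original file (printf expands \t).
row=$(printf 'Afghanistan\t647,500\t28513677\t42.46')

# With -d, cut splits on commas, so its "field 1" is everything up to the
# first comma: the country name, the TAB, and the leading digits of the area.
by_comma=$(printf '%s\n' "$row" | cut -d, -f2,3,4)

# With no -d option, cut splits on TABs, so field 2 is the whole area value.
by_tab=$(printf '%s\n' "$row" | cut -f2)

# awk converts a string to a number using its leading numeric prefix only,
# so in the C locale the conversion stops at the first comma.
as_number=$(printf '%s\n' "$by_tab" | LC_ALL=C awk '{ print $1 + 0 }')

printf '%s\n' "$by_comma"    # the row with "Afghanistan<TAB>647," removed
printf '%s\n' "$by_tab"      # 647,500
printf '%s\n' "$as_number"   # 647
```

So the commas have to be stripped from the value before awk adds it up, otherwise only the digits before the first comma are counted.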
- 03-30-2011 #5 Linux User
- Join Date
- Jan 2007
- Location
- cleveland
- Posts
- 452
please post a sample of the troublesome table
- 03-31-2011 #6 Just Joined!
- Join Date
- Mar 2011
- Posts
- 5
This is the table I was working with:
Countries    Area (sq-km)  Population  Life_Expectancy
Afghanistan  647,500       28513677    42.46
Cambodia     181,040       13363421    58.41
Canada       9,984,670     32507874    79.96
Mexico       1,972,550     104959594   74.94
US           9,631,418     293027571   77.43
In the original table the data fields were separated by TABs.
This was for a school lab project where we had to output:
1. The total population for all the countries
2. The average Life Expectancy for all the countries.
Number 1 and 2 were fairly simple. The instructor gave us a challenge which was:
3. The total area for all the countries.
That is what I was having a problem with.
- 04-01-2011 #7 Linux User
- Join Date
- Jan 2007
- Location
- cleveland
- Posts
- 452
we're really not supposed to do homework

if the input file is called "A"--and we neglect the column headings, or use
"tail" to remove them--then something like this should work OK:
cut -f2 A | sed 's/,//g' | awk '{printf "%7d\n",$0}' | awk '{a+=$1}END{print a}'
the first call to "awk" uses "printf" to right-align the digits.
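The same total can probably be had from a single awk call by stripping the commas with gsub() before adding. A rough sketch, not the only way to do it: the temp file from mktemp stands in for the real input file, and the variable name `total` is just for illustration. It assumes TAB-separated fields and one header line, as described earlier in the thread.

```shell
# Recreate the sample table from the thread in a temp file (TAB-separated).
tmp=$(mktemp)
printf 'Countries\tArea (sq-km)\tPopulation\tLife_Expectancy\n' > "$tmp"
printf 'Afghanistan\t647,500\t28513677\t42.46\n' >> "$tmp"
printf 'Cambodia\t181,040\t13363421\t58.41\n' >> "$tmp"
printf 'Canada\t9,984,670\t32507874\t79.96\n' >> "$tmp"
printf 'Mexico\t1,972,550\t104959594\t74.94\n' >> "$tmp"
printf 'US\t9,631,418\t293027571\t77.43\n' >> "$tmp"

# -F'\t'   split fields on TABs
# NR > 1   skip the header line
# gsub     delete every comma in field 2 before it is added to the sum
total=$(awk -F'\t' 'NR > 1 { gsub(/,/, "", $2); sum += $2 } END { print sum }' "$tmp")

echo "$total"    # 22417178 for the sample table above
rm -f "$tmp"
```

Doing the comma removal inside awk avoids the separate cut and sed steps, at the cost of a slightly longer awk program.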


