- 04-04-2011 #1Just Joined!
- Join Date
- Apr 2011
- Posts
- 4
collapsing rows - Awk
hi, well, hi, how are u, please please etc etc
my problem is, I have a table like this:
1 2 3
ab x
ab x
cd x
cd x
(
this thing doesn't allow me to tab the table, the coordinates are:
ab:1
ab:2
cd:2
cd:3
)
and i need to end with something like this
1 2 3
ab x x
cd x x
again, coordinates should be
ab:12
cd:23
of course, not in this format but as a table
I think it can be done with awk, but I'm just starting with it and the looping stuff is still kind of fuzzy for me
thanks!!
Last edited by alekos; 04-04-2011 at 01:36 AM.
- 04-04-2011 #2Linux Newbie
- Join Date
- Nov 2008
- Location
- Tokyo, Japan
- Posts
- 243
I think I understand what you want. It's so easy, I'll just show you how to do it.
Input file:
Code:
a b 1
a b 2
c d 3
c d 4
AWK Script:
Code:
#collapse.awk
{
    i = ($1 " " $2)
    my_keys[i] = ($1 " " $2)
    my_values[i] = (my_values[i] " " $3)
}
END {
    for (x in my_keys) {
        print(my_keys[x] " : " my_values[x])
    }
}
Executing on the command line:
Code:
% awk -f collapse.awk input.txt
c d : 3 4
a b : 1 2
- 04-04-2011 #3Just Joined!
- Join Date
- Apr 2011
- Posts
- 4
well, I think I can't make myself understood because of the lack of formatting here (or my lack of knowledge of the forum's text format)
the initial table is something like this
<tab>1<tab>2<tab>3
ab<tab>x
ab <tab><tab>x
cd <tab><tab>x
cd <tab><tab><tab>x
and i need to end with:
<tab>1<tab>2<tab>3
ab<tab>x<tab>x
cd<tab><tab>x<tab>x
thanks!
- 04-04-2011 #4Linux Newbie
- Join Date
- Nov 2008
- Location
- Tokyo, Japan
- Posts
- 243
The best place to learn about AWK is the GNU AWK User's Guide. At this site, they have many examples that are very easy to understand, and an index of important built-in functions you can use, like "length" and "sub".
The basic operation of an AWK program is simply to analyze every line of the input. Each line of the input file is passed, one by one, to the AWK program. Your program must contain code of the form Pattern->Action. The "Pattern" is usually a "regular expression". The action is simply some commands, for example "print", that are executed if the pattern matches.
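To make the Pattern->Action idea a bit more concrete, here is a minimal sketch. The sample input (fruit names and numbers) is made up purely for illustration. It also shows that a pattern does not have to be a regular expression: any expression, such as a comparison on a field, can act as a pattern.

```shell
# Hypothetical sample input, invented for this illustration.
result=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
# Rule 1: regular-expression pattern. The action runs only on lines containing "an".
/an/   { print "regex match: " $0 }

# Rule 2: expression pattern. The action runs only when the second field is above 5.
$2 > 5 { print "big number: " $1 }
')

printf '%s\n' "$result"
```

Every input line is tested against every rule in order, so a single line ("banana 12" here) can trigger more than one action.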
AWK has many commands, including "length(x)" to count the length of the string "x", arithmetic (+ - * / %), and "system" to execute a shell command from within the program.
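A small sketch of those three features together might look like the following. The input words are made up, and `true` is used as the shell command only because it is harmless and always succeeds.

```shell
# Hypothetical input: two short words, one per line.
result=$(printf 'hello\nawk\n' | awk '
{
    n = length($0)               # number of characters in the current line
    print $0, n, n * 2, n % 2    # arithmetic on that number
}
END {
    status = system("true")      # run a shell command; system() returns its exit status
    print "exit status:", status
}')

printf '%s\n' "$result"
```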
Here is a simple AWK program that uses regular expressions, and I think it will more closely match what you want. But it is more fun to learn it yourself, so please just use this as an example:
Code:
/^(..)(\t+)(.*)$/ {
#  ($1)($2 )($3)
# $1 = any two characters
# $2 = one or more <tab> characters
# $3 = all characters after the <tab> characters
#      until the end of the line

    #Here is the action
    my_keys[$1] = $1
    my_values[$1] = ($2 $3)
}
END {
    # The "END" pattern action matches the end of the input file.
    for (x in my_keys) {
        print x "\t" my_values[x]
    }
}

#REGEX PATTERNS:
# the pattern contains special characters inside of parentheses
# (.)      -> this will match any 1 character
#             for example "a", "?", or <space>
# (..)     -> this will match any 2 characters
#             for example "ab", "?_", or <space><tab>
# (A+)     -> this will match 1 or more "A" characters
#             for example "A", "AA", "AAA", ...
# (A*)     -> this matches 0 or more "A" characters
#             for example "", "A", "AA", "AAA", ...
# (\t+)    -> Match 1 or more <tab> characters
# (Hello$) -> Matches "Hello" only if it is at the end of the
#             line. "I said Hello" matches, "Hello!" does not.
# (^Hello) -> Matches "Hello" only if it is at the beginning of
#             the line. "Hello world!" matches, "I said Hello"
#             does not match.

#ACTIONS:
# In the action, the parts of the pattern are assigned to $1,
# $2, $3, etc. That is, the first parentheses matched are placed
# in $1, the second parentheses matched are placed in $2, etc.
# $0 is always equal to the whole line.
# you can assign strings to variable names:
#     a = "Hello"
#     b = "world"
#     c = (a ", " b "!")
# ...now c is "Hello, world!"
- 04-06-2011 #5Just Joined!
- Join Date
- Apr 2011
- Posts
- 4
hmm, ok, thank you. In fact, I feel like awk is like grep/sed, so it is difficult for me to think of a way in which it could go through various lines, compare them, etc.
- 04-06-2011 #6Linux Newbie
- Join Date
- Nov 2008
- Location
- Tokyo, Japan
- Posts
- 243
Actually, I make that mistake too. In fact, I made a mistake in my previous code that I posted here!
I confused AWK with Perl! I remember now that in Perl, when you use parentheses in regular expressions, the characters matching the input line inside the parentheses are stored to variables $1, $2, $3, ... , but this is not the case in "awk"!
Sorry! Let me show you what I got wrong.
Code:
/^(..)(\t+)(.*)$/ {
#  ($1)($2 )($3) <- This is true for "Perl" but not for "AWK"
# $1, $2, $3 are NOT the contents of the parentheses of the above pattern.
# Actually, $1, $2, and $3 are the "fields" of the input line.
# Fields are created by breaking-up the input line between white-spaces.

    #Here is the action
    # lets say the input line is "ab 1"
    my_keys[$1] = $1         # my_key["ab"] = "ab"
    my_values[$1] = ($2 $3)  # my_value["ab"] = ("1" "") = "1"
}
I tested the above code to make sure it worked. By coincidence the above code was wrong but it still worked correctly because of the way the input was formatted, by separating the input lines by whitespaces!
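The difference between fields and parenthesized groups can be checked with a short sketch like the one below. The one-line input ("ab", a tab, "x") is an assumption chosen to resemble the table rows in this thread. The standard `match()` function, which sets `RSTART` and `RLENGTH`, is one way to find out where part of a pattern matched, since the parentheses themselves capture nothing in AWK.

```shell
# Hypothetical input line: "ab", one TAB, then "x".
result=$(printf 'ab\tx\n' | awk '
/^(..)(\t+)(.*)$/ {
    # $1 and $2 come from splitting the line on whitespace,
    # not from the parentheses in the pattern above.
    print "field 1: " $1
    print "field 2: " $2
    print "number of fields: " NF

    # match() reports where the run of tabs sits inside the line.
    if (match($0, /\t+/)) {
        print "tabs start at " RSTART ", length " RLENGTH
    }
}')

printf '%s\n' "$result"
```

Here the second field is "x" rather than the tab that the second pair of parentheses matched, which is exactly the mix-up described above.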
Everything else I said about "regular expression" patterns is true.
Awk is a very simple programming language -- it just executes one action for every line of input that matches a pattern. If there is no pattern, the action is performed on every line. AWK does not take too much time to learn, and it is very useful.
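Putting these pieces together, one possible way to collapse the tab-separated table from post #3 is sketched below. It is only a sketch under some assumptions: the input is reconstructed from that post with every mark written as "x", the first line is a header, and the names `seen`, `order`, `cell`, `nkeys`, and `ncols` are made up for this example. Arrays are what let AWK "remember" earlier lines, so rows with the same key can be merged once the whole input has been read.

```shell
# Hypothetical input, reconstructed from the table in post #3 (TAB-separated).
result=$(printf '\t1\t2\t3\nab\tx\nab\t\tx\ncd\t\tx\ncd\t\t\tx\n' | awk '
BEGIN { FS = OFS = "\t" }              # split on single TABs so empty cells are kept
NR == 1 { print; ncols = NF; next }    # pass the header line through unchanged
{
    key = $1
    if (!(key in seen)) {              # remember keys in order of first appearance
        seen[key] = 1
        order[++nkeys] = key
    }
    if (NF > ncols) ncols = NF
    for (c = 2; c <= NF; c++)
        if ($c != "") cell[key, c] = $c    # keep any non-empty mark for this key and column
}
END {
    for (k = 1; k <= nkeys; k++) {
        key = order[k]
        line = key
        for (c = 2; c <= ncols; c++)
            line = line OFS cell[key, c]
        sub(/\t+$/, "", line)          # drop trailing empty cells
        print line
    }
}')

printf '%s\n' "$result"
```

The `order` array is there because `for (x in array)` visits keys in no guaranteed order, which is why the earlier collapse.awk output listed "c d" before "a b".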
If you have any more questions, let us know. I will make sure not to give you wrong information next time! Sorry!
- 04-06-2011 #7Just Joined!
- Join Date
- Apr 2011
- Posts
- 4
hmm, ok, i see
well, i think i have a lot to learn, thank you for that!

