- 09-22-2011 #1 Just Joined!
- Join Date
- Jul 2008
- Posts
- 9
AWK : Need help to print last bunch of patterns
Hello
I need to write a script to parse a string into 4 tokens, where the last token holds everything that remains after the first three. I will explain with an example.
input string = "80 00 00 01 00 09 08 02 80 01 5a 08 02"
Now I need to get it as below
"8000,0001,0009,080280015a0802"
I guess we can do it with AWK, but I am not getting to a solution quickly. Could someone please help me with this? If not awk, what would be the best way to do it?
PS: the input string length may vary; it is not fixed.
Thanks
~S
- 09-22-2011 #2
Code:
% s='80 00 00 01 00 09 08 02 80 01 5a 08 02'
% awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=' ' ORS=\\n <<<"$s"
8000,0001,0009,080280015a0802
If your shell doesn't support here-strings:
Code:
printf '%s\n' "$s" | awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=' ' ORS=\\n
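For readers who find the one-liner hard to follow, here is a more verbose sketch of the same transformation. It is not the approach posted above: it leaves RS alone and walks the fields of the line instead, assuming the input is a single line of space-separated tokens. The function name `group_tokens` is made up for this example.

```shell
# group_tokens is a hypothetical helper name, not something from the thread.
group_tokens() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            out = out $i
            # comma after fields 2, 4 and 6 only
            if (i % 2 == 0 && i < 8) out = out ","
        }
        print out
    }'
}

group_tokens '80 00 00 01 00 09 08 02 80 01 5a 08 02'
```

For the sample string this prints 8000,0001,0009,080280015a0802, and it copes with shorter or longer inputs because the loop runs up to NF.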
- 09-22-2011 #3
Thanks a lot for the help. Sorry to say that I am just a novice with awk. Could you please help me understand this script?
- 09-22-2011 #4
I'll try. It's somewhat non-trivial, and I'm not sure
you really want to know all that.
An awk program consists of a series of rules.
A rule consists of a pattern followed by an action;
either the pattern or the action can be omitted, but not both.
When the action is omitted, awk prints the record if the pattern evaluates to true.
In this small program we have a pattern only, and
in this case the pattern is an expression:
Code:
(ORS = !(NR % 2) && NR < 8 ? OFS : x) || 1
After the program code we set some built-in variables:
Code:
OFS=, RS=' ' ORS=\\n
OFS=, - the Output Field Separator is set to a comma ','
RS=' ' - the input Record Separator is set to a single space, so every space-separated token
becomes a separate record, just like this:
Code:
$ s='80 00 00 01 00 09 08 02 80 01 5a 08 02'
$ awk 1 RS=' ' <<<"$s"
80
00
00
01
00
09
08
02
80
01
5a
08
02
The ORS part is unnecessary (I set it at the beginning and forgot to remove it).
ORS stands for Output Record Separator.
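One detail worth checking for yourself: with RS set to a space, the newline at the end of the input is no longer a separator, so it stays inside the last record. A minimal sketch that reports the record count and the length of the last record (the helper name `last_record_length` is hypothetical):

```shell
# last_record_length is a made-up name for this illustration.
last_record_length() {
    # n is overwritten on every record, so END sees the length of the last one
    printf '%s\n' "$1" | awk '{ n = length($0) } END { print NR, n }' RS=' '
}

last_record_length '80 00 5a'
```

This prints "3 3": three records, and the last one is `5a` plus the trailing newline. That embedded newline is what ends the output line of the one-liner even though ORS is empty for the last record.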
So our expression consists of two sub-expressions and the logical OR operator:
sub-expression 1:
Code:
( ORS = !(NR % 2) && NR < 8 ? OFS : x)
sub-expression 2:
Code:
1
Binary logical OR operator:
Code:
||
So the entire expression evaluates to true when either the first
or the second sub-expression evaluates to true.
This is actually a shortcut: I'm not really interested
in the result, so I'm artificially forcing the entire expression to be true
by adding the OR operator and the second sub-expression, 1.
As far as the awk programming language is concerned,
the entire expression then always evaluates to true (in awk, 0 for numbers
and the empty string "" for strings evaluate to false; everything else evaluates to true).
Any value OR true is always true.
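The truth rules above can be tried directly with pattern-only programs: a true pattern prints the record, a false one prints nothing. A small sketch, where `truthy` is a hypothetical wrapper that feeds one record to whatever pattern it is given:

```shell
# truthy is a made-up helper: it runs the given pattern-only awk program
# against a single input record.
truthy() {
    printf 'record\n' | awk "$1"
}

truthy '0'        # false: prints nothing
truthy '""'       # false: prints nothing
truthy '1'        # true: prints the record
truthy '"a"'      # true: prints the record
truthy '0 || 1'   # anything OR true is true: prints the record
```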
So now that you understand the
Code:
||1
part, let me explain the other one (the first sub-expression):
Code:
ORS = !(NR % 2) && NR < 8 ? OFS : x
It sets the ORS (Output Record Separator):
- if the current record number NR modulo 2 is 0 AND the
current record number is less than 8, it sets ORS to OFS (a comma).
- otherwise it sets it to NULL, the empty string (x is an uninitialized variable).
In our case NR % 2 is 0 when NR is 2, 4 and 6:
Code:
$ awk 'NR < 8 { print NR, "=>", NR%2 }' RS=' ' <<<"$s"
1 => 1
2 => 0
3 => 1
4 => 0
5 => 1
6 => 0
7 => 1
So we want a comma after the second, the fourth and the sixth records.
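If the ternary is hard to read, the same separator logic can be spelled out with an explicit action. This is a sketch of an equivalent long form, not the code posted in the thread; here every record is printed unconditionally and only ORS changes.

```shell
s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

out=$(printf '%s\n' "$s" | awk '{
    if (NR % 2 == 0 && NR < 8)
        ORS = OFS      # comma after records 2, 4 and 6
    else
        ORS = ""       # nothing after every other record
    print
}' OFS=, RS=' ')

printf '%s\n' "$out"
```

For the sample string this prints 8000,0001,0009,080280015a0802.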
And we have this:
Code:
$ awk 'END { print "\n" } ORS=!(NR%2)&&NR<8?OFS:x' OFS=, RS=' ' <<<"$s"
00,01,09,
What's missing? The records terminated by x (the uninitialized variable with NULL value).
Why? Because ORS=x is ORS=NULL, and the return value of an assignment is the assigned value,
which here is the empty string and so evaluates to false. A false pattern means the record
is not printed. We need to output those records as well, so we add
the second sub-expression, 1:
Code:
$ awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=' ' <<<"$s"
8000,0001,0009,080280015a0802
Now we have all we need.
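The point about an assignment's value acting as the pattern can be seen in isolation. In this small sketch (the variable names are invented for the example), the input has three lines, the middle one empty; the assignment `v = $0` yields an empty string for that line, so the rule skips it unless `|| 1` is added.

```shell
lines='a

b'

# Pattern is an assignment: the empty line assigns "", which is false.
kept_without=$(printf '%s\n' "$lines" | awk '(v = $0)' | awk 'END { print NR }')

# With "|| 1" the value of the assignment no longer matters.
kept_with=$(printf '%s\n' "$lines" | awk '(v = $0) || 1' | awk 'END { print NR }')

echo "without || 1: $kept_without lines, with || 1: $kept_with lines"
```

The first count is 2 and the second is 3.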
Hope this helps.
- 09-22-2011 #5
That was a great explanation, thanks a ton for your help. I think I need to dive deeper into AWK. I normally use only minimal features of AWK, like print and some simple input pattern matching. This kind of stuff makes life more fun!
- 09-26-2011 #6
I am confused about the NR and NF concepts. NR is the number of records and NF is the number of fields. So when we feed a single line as input, NR should be 1 and NF should be the number of fields separated by [space] (assuming FS is a space and RS is \n). But here we are doing our processing based on NR: ORS = !(NR % 2) && NR < 8 ? OFS : x. So how does this work?
Thanks
Salil
- 09-26-2011 #7
Hi Salil,
Remember that we changed RS to a single space, so as far as awk is concerned each space-separated token is now a record of its own rather than a field. We test NR, but what we are really counting is what would have been the fields (up to NF) of the line under the default RS.
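A quick way to see the difference is to count records and fields for the same input under both settings. This is only an illustrative sketch; the variable names are invented for the example.

```shell
s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

# Default RS (newline): one record holding 13 fields.
default_rs=$(printf '%s\n' "$s" | awk '{ print NR, NF }')

# RS set to a space: 13 records, one token each.
space_rs=$(printf '%s\n' "$s" | awk 'END { print NR }' RS=' ')

echo "default RS: NR NF = $default_rs"
echo "RS=' ':     NR    = $space_rs"
```

With the default RS the result is "1 13"; with RS set to a space NR reaches 13, which is why the one-liner can use NR to count tokens.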
- 09-26-2011 #8
aah .. thanks for that pointer !!

