  1. #1

    AWK : Need help to print last bunch of patterns


    I need to write a script to parse a string and get 4 tokens from the string, with the remaining tokens as a single token. I will explain with an example.

    input string = "80 00 00 01 00 09 08 02 80 01 5a 08 02"
    Now I need to get it as below


    I guess we can do it with AWK, but I am not getting a solution quickly. Could someone please help me with this? If not awk, what would be the best way to do this?

    PS: the input string length may vary; it is not fixed.


  2. #2
    % s='80 00 00 01 00 09 08 02 80 01 5a 08 02'
    % awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=\  ORS=\\n<<<"$s"
    If your shell doesn't support here-strings:

    printf '%s\n' "$s"|awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=\  ORS=\\n

  3. #3
    Thanks a lot for the help. Sorry to say that I am just a novice at awk scripting. Could you please help me understand this script?

  5. #4
    I'll try; it's somewhat non-trivial and I'm not sure
    you really want to know all that.

    An awk program consists of a series of rules.
    A rule consists of a pattern followed by an action;
    either the pattern or the action can be omitted, but not both.
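
    A minimal sketch of the three rule shapes, using the string from this thread. The variable names (pattern_only, action_only, both) are made up for illustration, and printf is used instead of a here-string so it also runs in a plain sh:

    ```shell
    s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

    # Pattern only: the default action (print the record) runs when the pattern is true.
    pattern_only=$(printf '%s\n' "$s" | awk 'NF > 4')

    # Action only: the action runs for every record.
    action_only=$(printf '%s\n' "$s" | awk '{ print NF }')

    # Pattern and action together.
    both=$(printf '%s\n' "$s" | awk 'NF > 4 { print $1, $NF }')

    printf '%s\n' "$pattern_only" "$action_only" "$both"
    ```

    The first command prints the whole line, the second prints 13 (the number of fields), and the third prints the first and the last field.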

    In this small program we have a pattern only and
    in this case the pattern is an expression:

    (ORS = !(NR % 2) && NR < 8 ? OFS : x) || 1
    After the program code we set some built-in variables:

    OFS=, RS=\  ORS=\\n
    OFS=, - the Output Field Separator is set to a comma ','
    RS=\  - the input Record Separator is set to a single space, so every space-separated token
    becomes a separate record, just like this:

    $ s='80 00 00 01 00 09 08 02 80 01 5a 08 02'
    $ awk 1 RS=\  <<<"$s"
    The ORS part is unnecessary (I set it at the beginning and I forgot to remove it).
    ORS stands for Output Record Separator.
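
    To check how a single-space RS splits the string, here is a small sketch that labels the first three records with their record number. The variable name first_three is only for illustration, and RS=' ' is just another way of quoting the same single space:

    ```shell
    s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

    # RS=' ' makes a single space the record separator,
    # so each token of the string arrives as its own record.
    first_three=$(printf '%s\n' "$s" | awk 'NR <= 3 { print NR ": " $0 }' RS=' ')

    printf '%s\n' "$first_three"
    ```

    This prints "1: 80", "2: 00" and "3: 00" on separate lines. Note that a newline no longer ends a record here, so the last record is "02" followed by the newline that printf (or the here-string) appends.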

    So our expression consists of two sub-expressions and the logical OR operator:

    sub-expression 1:

    ( ORS = !(NR % 2) && NR < 8 ? OFS : x)
    sub-expression 2:

    1
    Binary logical OR operator:

    ||
    So the entire expression evaluates to true when either the first
    or the second sub-expression evaluates to true.

    This is actually a shortcut: I'm not really interested
    in the result, so I'm artificially forcing the entire expression to be true
    by adding the OR operator and the second sub-expression, 1,
    so that, as far as the awk programming language is concerned,
    the entire expression always evaluates to true (in awk, 0 (for numbers)
    and the empty string "" (for strings) evaluate to false; everything else evaluates to true).
    Any value OR true is always true.
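
    The truth rules above can be tried directly. This is only an illustrative sketch: the labels and the variable name truth are made up, and x is deliberately never assigned:

    ```shell
    truth=$(awk 'BEGIN {
        print "zero ->",         (0 ? "true" : "false")
        print "empty string ->", ("" ? "true" : "false")
        print "unset x ->",      (x ? "true" : "false")
        print "comma ->",        ("," ? "true" : "false")
        print "x || 1 ->",       ((x || 1) ? "true" : "false")
    }')

    printf '%s\n' "$truth"
    ```

    The first three lines report false, and the last two report true.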

    So now that you understand the ||1 part, let me
    explain the other one (the first sub-expression):

    ORS = !(NR % 2) && NR < 8 ? OFS : x
    It sets the ORS (Output Record Separator):
    - if NR % 2 is 0 (the current record number NR is even) AND the
    current record number is less than 8, it sets the ORS to OFS (a comma).
    - otherwise it sets it to the empty string (x is an uninitialized variable).
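
    For readers who find the one-liner hard to follow, the same two cases can be spelled out as an ordinary if/else action. This is a sketch of an equivalent, longer form, not the code posted above; the variable name result is only for illustration:

    ```shell
    s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

    result=$(printf '%s\n' "$s" | awk '{
        if (NR % 2 == 0 && NR < 8)
            ORS = OFS      # even record among the first seven: end it with a comma
        else
            ORS = ""       # every other record: end it with nothing
        print              # print the record followed by the ORS chosen above
    }' OFS=, RS=' ')

    printf '%s\n' "$result"
    ```

    With the sample string this prints 8000,0001,0009,080280015a0802.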

    In our case NR % 2 is 0 when NR is 2, 4 and 6:

    $ awk 'NR < 8 { print NR, "=>", NR%2 }' RS=\  <<<"$s"
    1 => 1
    2 => 0
    3 => 1
    4 => 0
    5 => 1
    6 => 0
    7 => 1
    So we want a comma after the second, the fourth and the sixth records.
    And we have this:

    $ awk 'END { print "\n" } ORS=!(NR%2)&&NR<8?OFS:x' OFS=, RS=\  <<<"$s"
    What's missing? The records terminated by x (the uninitialized variable, whose value is the empty string).
    Why? Because ORS=x is ORS="" and the value of an assignment expression is the assigned value,
    which here evaluates to false. We need to output those records as well, so we need to add
    the second sub-expression, 1:

    awk '(ORS=!(NR%2)&&NR<8?OFS:x)||1' OFS=, RS=\  <<<"$s"
    Now we have all we need.
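
    To see the effect of the || 1 part side by side, here is a small sketch that runs the pattern with and without it. The variable names without_or and with_or are made up, and the assignment is parenthesized in both cases:

    ```shell
    s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

    # Without "|| 1": a record is printed only when the assigned ORS is non-empty,
    # that is, only records 2, 4 and 6.
    without_or=$(printf '%s\n' "$s" | awk '(ORS = !(NR % 2) && NR < 8 ? OFS : x)' OFS=, RS=' ')

    # With "|| 1": the pattern is always true, so every record is printed.
    with_or=$(printf '%s\n' "$s" | awk '(ORS = !(NR % 2) && NR < 8 ? OFS : x) || 1' OFS=, RS=' ')

    printf '%s\n' "$without_or" "$with_or"
    ```

    The first line printed is 00,01,09, and the second is 8000,0001,0009,080280015a0802.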

    Hope this helps.

  6. #5
    That was a great explanation, thanks a ton for your help. I think I need to dive deep into AWK. I normally use only minimal features of AWK, like print and some small input pattern matching. This kind of stuff makes life more fun!

  7. #6
    I am confused about the NR and NF concepts. NR is the number of records and NF is the number of fields. So when we feed a line as input to be processed, NR should be 1 and NF should be the number of fields separated by [space] (the assumption here is that FS is space and RS is \n). But here we are doing our processing based on NR -- ORS = !(NR % 2) && NR < 8 ? OFS : x. So how does this work?


  8. #7
    Hi Salil,
    remember that we modified RS so that the records are separated by a single space. As far as the awk processing is concerned, what would normally be fields become records. We test NR, but what we are actually counting are the tokens that would otherwise be the fields $1 to $NF.
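
    A quick way to see this is to count records and fields with the default RS and then with RS set to a single space. This is only a sketch; the variable names by_line and by_space and the helper variable nf are made up:

    ```shell
    s='80 00 00 01 00 09 08 02 80 01 5a 08 02'

    # Default RS (newline): one record holding 13 fields.
    by_line=$(printf '%s\n' "$s" | awk '{ nf = NF } END { print "records=" NR, "fields=" nf }')

    # RS set to a single space: 13 records, each holding one field.
    by_space=$(printf '%s\n' "$s" | awk '{ nf = NF } END { print "records=" NR, "fields=" nf }' RS=' ')

    printf '%s\n' "$by_line" "$by_space"
    ```

    This prints records=1 fields=13 for the default RS and records=13 fields=1 for the single-space RS (nf holds the field count of the last record read).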

  9. #8
    aah .. thanks for that pointer !!
