  1. #1
    Just Joined!
    Join Date
    Feb 2005
    Posts
    37

    in need of sed/awk/cut help! thanks!

    hello,

    I have a very long line and I want to parse the names out of it. Imagine a line like this:

    hello my name is = Jonathan<miscellaneous text>hello my name is = Michael<miscellaneous text>hello my name is = Eric

    I want to get all the names out of it. I don't know how long each name is, but I know that every name will be followed by the word "hello". Any idea how I can get all these names parsed out into a file like this:

    Jonathan
    Michael
    Eric

    Thanks in advance!!!

  2. #2
    Linux Guru smolloy's Avatar
    Join Date
    Apr 2005
    Location
    CA, but from N.Ireland
    Posts
    2,413
    Quick guess
    Code:
    awk -F " hello " '{print $NF}' yourfile.txt
    I might have got the syntax wrong, but this changes the field separator from a space to " hello " (note the spaces on either side). It then prints out the last word from each "field" -- which should be the names you're looking for.

    Apologies if I got the syntax of the command wrong, but something like this should work.
    Registered Linux user #388328 || Registered LFS user #15880
    AMD 64 X2 4600+ :: 2X1GB DDR2 800 :: GeForce 9400 GT 512MB :: ASUS M2N32 Deluxe :: 4X250GB SATAII
    Need instant help? Try us on IRC -- #linuxforums on freenode

  3. #3
    Linux Guru smolloy's Avatar
    Join Date
    Apr 2005
    Location
    CA, but from N.Ireland
    Posts
    2,413
    Sorry, I was wrong. Try this
    Code:
    awk '{print $NF}' RS="hello" yourfile.txt
    This changes the record separator, not the field separator, and prints the last field from every record (which is the correct thing to do). In this case the last field before every record separator ("hello") will be a name.

  4. #4
    Just Joined!
    Join Date
    Feb 2005
    Posts
    37
    This sort of worked. The problem is that I have miscellaneous text between the hellos, for instance:

    hello my name is = Jonathan how are you hello my name is = Michael what are you doing today? hello my name is = Eric What's up

    So there is a space after the name, but I won't know exactly how long the miscellaneous text will be. Is there any way to use awk to pull out everything between the "=" sign and the next space (where the miscellaneous text begins)? Thanks again.

  5. #5
    Linux Newbie radoulov's Avatar
    Join Date
    Sep 2007
    Posts
    111
    Use perl:

    Code:
    perl -lne'print $1 while /=\s*(\S+)/g' file
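For anyone without perl at hand, here is a rough awk sketch of the same idea: find each "=", skip any blanks, and print the next run of non-blank characters. The function name `extract_names` is made up for illustration, and the sample line is the one from post #4. It relies only on `match()`, `substr()` and `sub()`, which POSIX awk provides, but treat it as a sketch rather than a drop-in replacement for the perl one-liner.

```shell
# Hypothetical helper: print the first non-blank word after every "=".
extract_names() {
    awk '{
        s = $0
        # match() finds the leftmost "=", optional blanks, then a word,
        # and records its position in RSTART and its length in RLENGTH
        while (match(s, /=[ \t]*[^ \t]+/)) {
            name = substr(s, RSTART, RLENGTH)
            sub(/^=[ \t]*/, "", name)          # drop the "=" and the blanks after it
            print name
            s = substr(s, RSTART + RLENGTH)    # keep searching after this match
        }
    }' "$@"
}

# Sample line from post #4
printf '%s\n' "hello my name is = Jonathan how are you hello my name is = Michael what are you doing today? hello my name is = Eric What's up" | extract_names
```

On that sample line this prints Jonathan, Michael and Eric, one per line. Like the perl version, it would also pick up a word after any "=" that happens to appear in the miscellaneous text.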

  6. #6
    Linux User
    Join Date
    Jun 2007
    Posts
    318
    The problem is that you have multiple occurrences on a line. The only way I see to do this in awk is to write an awk program as follows; let's call it names.awk:

    Code:
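    # names.awk: print the word that follows each "hello my name is = "
    {
        lne = $0
        # index() returns 0 once the marker no longer occurs in the remaining text
        while ((pos = index(lne, "hello my name is = ")) != 0) {
            lne = substr(lne, pos + 19)   # skip past the 19-character marker
            $0 = lne                      # re-split the record so $1 is the name
            print $1
        }
    }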
    This will search the line for "hello my name is = ", remove everything before the actual name, and print the name. It repeats this until it can't find another "hello my name is = ".

    The command would be:

    Code:
    awk -f names.awk yourfile.txt

  7. #7
    Linux Newbie radoulov's Avatar
    Join Date
    Sep 2007
    Posts
    111
    Or:
    Code:
    awk 'NR>1&&$0=$1' RS=\= file
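How the one-liner above works, as far as I can tell: `RS=\=` makes "=" the record separator, so every record after the first starts with a name. `$0=$1` replaces the record with its first field, and because that assignment yields a non-empty string (true), awk prints it. A spelled-out sketch of the same logic follows; the function name `first_word_after_equals` is invented for illustration, and `NF > 0` stands in for the truth test in the original.

```shell
# Hypothetical helper: same idea as the one-liner, written out in full.
first_word_after_equals() {
    # RS = "=" splits the input at every "=".
    # Record 1 is the text before the first "=", so it is skipped.
    awk 'BEGIN { RS = "=" }
         NR > 1 && NF > 0 { print $1 }' "$@"
}

# Sample line from post #4
printf '%s\n' "hello my name is = Jonathan how are you hello my name is = Michael what are you doing today? hello my name is = Eric What's up" | first_word_after_equals
```

Since RS is a single character here, this should not depend on gawk. The catch is that any "=" inside the miscellaneous text starts a new record too, so the word after it gets printed as if it were a name.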

  8. #8
    Just Joined!
    Join Date
    Jun 2008
    Posts
    34
    The following works even if your miscellaneous text contains "=":

    Code:
    awk '$0=$1' RS="hello my name is =" yourfile.txt

  9. #9
    Linux Newbie radoulov's Avatar
    Join Date
    Sep 2007
    Posts
    111
    Only with GNU Awk though ...

  10. #10
    Just Joined!
    Join Date
    Jun 2008
    Posts
    34
    Yes, this will only work with gawk, which allows RS to be a regular expression instead of just a single character.
    Since djcham already mentioned that smolloy's suggestion sort of worked, I assumed he is using a version of Linux with gawk as the default awk.
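If gawk is not available, one possible workaround is to leave RS alone and split each line on the whole marker with `split()`, which accepts a regular expression as its separator in POSIX awk. This is only a sketch under that assumption; `names_portable` is a made-up name, and a marker is assumed never to be broken across two lines.

```shell
# Hypothetical helper: split each line on the full marker instead of changing RS.
names_portable() {
    awk '{
        # The third argument is a regular expression: the marker plus any spaces after "="
        n = split($0, part, /hello my name is = */)
        for (i = 2; i <= n; i++) {       # part[1] is the text before the first marker
            split(part[i], word, " ")    # " " splits on runs of blanks
            if (word[1] != "") print word[1]
        }
    }' "$@"
}

# A stray "=" in the miscellaneous text is not mistaken for a marker
printf '%s\n' "hello my name is = Ann 1 = 2 hello my name is = Bob bye" | names_portable
```

This prints Ann and Bob. Because the separator is the full phrase rather than a lone "=", an "=" in the miscellaneous text is left alone, which is the same property the RS version in post #8 has.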
