  1. #1
    Just Joined!
    Join Date
    Oct 2006
    Posts
    24

    extracting specific line instead of printing whole line using awk/sed

    Hi all,

    I understand how to find a specific pattern, but how do you "print" only the pattern you're looking for instead of printing the whole line that the pattern is on? In my case, I have to extract website addresses that have a similar pattern of:

    "http://www.address.com"

    - All of the addresses in the file start with a double quote, followed by "http://", and end with a double quote. So that's what I need to match to get those patterns out. But how do I get the output to print only what I'm looking for and not the whole line?

    - A sample of the type of file I'm dealing with is below:

    asdasdsad9999123123===080"http://www.linux.com"kajnsknd123123
    ijiasdkkjnas"http://www.ign.com/reviews"nas,mdn123123as

    etc....

    Any suggestions or help would be greatly appreciated,

    Thanks

  2. #2
    Just Joined!
    Join Date
    Oct 2006
    Posts
    24
    Oh yeah, I used the following sed command to match the pattern "http://".
    I hope that gives you an idea of what I'm trying to do.

    sed -n '/\"http:\/\/.*"/ p' bookmarks.html
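
One way to make sed print only the address rather than the whole line is to replace the line with a captured group and print only when the substitution succeeds. The following is a minimal sketch, assuming at most one quoted http:// address per line; the function name extract_urls_sed is made up for illustration.

```shell
# Hypothetical helper: reads text on stdin, prints one address per matching line.
extract_urls_sed() {
    # -n turns off automatic printing. The s command replaces the whole line
    # with the captured text between the quotes, and the trailing p prints
    # the line only if that substitution actually happened.
    sed -n 's|.*"\(http://[^"]*\)".*|\1|p'
}

# Demo on the sample lines from the first post, plus one line without an address.
printf '%s\n' \
    'asdasdsad9999123123===080"http://www.linux.com"kajnsknd123123' \
    'ijiasdkkjnas"http://www.ign.com/reviews"nas,mdn123123as' \
    'a line without any address' |
    extract_urls_sed
# prints:
# http://www.linux.com
# http://www.ign.com/reviews
```

Because the leading .* is greedy, a line holding several addresses would yield only the last one.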

  3. #3
    Linux User
    Join Date
    Jun 2007
    Posts
    318
    If you don't mind using a bash script instead, here's one. It's crude but it works.

    Code:
    #!/bin/bash
    
    typeset -i _pos
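    while IFS= read -r _lne
        do
            # Position of the first "http://" on the line
            _pos="`echo "$_lne" | awk -F"^" '{print match($1,"http://")}'`"
            # Everything from that position to the end of the line
            _tmp="`echo "$_lne^$_pos" | awk -F"^" '{print substr($1,$2)}'`"
            _tmp2='"'
            # Position of the closing double quote, then step back one character
            _pos="`echo "$_tmp^$_tmp2" | awk -F"^" '{print match($1,$2)}'`"
            _pos=_pos-1
            # Keep only the text before the closing quote
            _tmp2="`echo "$_tmp^$_pos" | awk -F"^" '{print substr($1,1,$2)}'`"
            echo "$_tmp2"
        done < bookmarks.html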
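
The same two steps (find the start of the address, then cut at the next double quote) can probably be done with shell parameter expansion alone, without starting awk for every line. This is only a sketch: it assumes each address is preceded by a double quote, handles the first address on a line only, and the function name extract_urls_sh is invented for the example.

```shell
# Hypothetical helper: reads text on stdin, prints the first quoted address per line.
extract_urls_sh() {
    while IFS= read -r line; do
        case $line in
            *'"http://'*)
                # Drop everything up to and including the first "http://
                rest=${line#*\"http://}
                # Drop everything from the next double quote to the end
                rest=${rest%%\"*}
                printf 'http://%s\n' "$rest"
                ;;
        esac
    done
}

printf '%s\n' \
    'asdasdsad9999123123===080"http://www.linux.com"kajnsknd123123' \
    'ijiasdkkjnas"http://www.ign.com/reviews"nas,mdn123123as' |
    extract_urls_sh
# prints:
# http://www.linux.com
# http://www.ign.com/reviews
```

Lines that contain no quoted address are skipped by the case pattern instead of producing stray output.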

  4. #4
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Code:
    awk 'BEGIN{FS="\""}
    {
      for(i=1;i<=NF;i++){
       if( $i ~ /http/ ) print $i
      }
    }
    ' file
    output:
    Code:
    # ./test.sh
    http://www.linux.com
    http://www.ign.com/reviews
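
Splitting on double quotes prints any field that contains "http", including text that happens to sit outside the quotes. A stricter variant, sketched here under the assumption that every address is wrapped in double quotes and starts with http://, uses awk's match() together with RSTART and RLENGTH. The function name extract_urls_awk is just an illustrative choice.

```shell
# Hypothetical helper: reads text on stdin, prints every quoted address found.
extract_urls_awk() {
    awk '{
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match in line
        while (match(line, /"http:\/\/[^"]*"/)) {
            # Skip the opening quote and leave out the closing one
            print substr(line, RSTART + 1, RLENGTH - 2)
            # Keep searching in the text after this match
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}

printf '%s\n' \
    'asdasdsad9999123123===080"http://www.linux.com"kajnsknd123123' \
    'ijiasdkkjnas"http://www.ign.com/reviews"nas,mdn123123as' |
    extract_urls_awk
# prints:
# http://www.linux.com
# http://www.ign.com/reviews
```

The loop means a line with several quoted addresses yields one output line per address.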

  5. #5
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Quote Originally Posted by vsemaska View Post
    It's crude but it works.
    awk is a general-purpose language, readily available on *nix, that can do the jobs of grep, sed, cut, and wc, to name a few. Learning how to use awk to your advantage will certainly benefit you in the future.

  6. #6
    Just Joined!
    Join Date
    Oct 2006
    Posts
    24
    Thanks for the help! Greatly appreciated.

  7. #7
    Linux User
    Join Date
    Jun 2007
    Posts
    318
    Quote Originally Posted by ghostdog74 View Post
    awk is a general-purpose language, readily available on *nix, that can do the jobs of grep, sed, cut, and wc, to name a few. Learning how to use awk to your advantage will certainly benefit you in the future.
    I manage over 30 servers that are either Red Hat Linux or (Tru64) UNIX. Some of those Tru64 UNIX servers are in one of two TruCluster UNIX clusters, which are very complicated. I have to know bash, ksh, csh, perl, plus a couple of other scripting languages, plus numerous software products used on the servers. I simply don't have the time or desire to learn another cryptic scripting language.

  8. #8
    Just Joined!
    Join Date
    Dec 2008
    Posts
    1

    If you like perl

    Just to note you could use something like

    Code:
    perl -ne 'print "$1\n" if m#(https?://.+?)"#' <file>

  9. #9
    Linux Newbie
    Join Date
    Jul 2008
    Posts
    181
    grep can do that:

    Code:
    grep -o '"http://[^"]\+"'
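
Note that -o prints the match including its surrounding quotes, and that both -o and \+ are extensions found in GNU grep (and some other implementations) rather than POSIX features. A small sketch that strips the quotes afterwards, assuming a grep that supports -o; the function name extract_urls_grep is made up for the example.

```shell
# Hypothetical helper: reads text on stdin, prints every quoted address found.
extract_urls_grep() {
    # -o prints each match on its own line; tr then deletes the double quotes.
    # [^"]* is used instead of [^"]\+ so the pattern stays within basic regex syntax.
    grep -o '"http://[^"]*"' | tr -d '"'
}

printf '%s\n' \
    'asdasdsad9999123123===080"http://www.linux.com"kajnsknd123123' \
    'ijiasdkkjnas"http://www.ign.com/reviews"nas,mdn123123as' |
    extract_urls_grep
# prints:
# http://www.linux.com
# http://www.ign.com/reviews
```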
