  1. #1
    Just Joined!
    Join Date
    Feb 2007
    Posts
    7

    Extract email addresses from big file.

    Hey.

    I have a big text file with data,
    and I want to extract the email addresses from it.

    How can I do it?

  2. #2
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    You can use tools like awk, sed, perl, python, etc.
    Show a sample of the file and your expected output.

  3. #3
    Just Joined!
    Join Date
    Feb 2007
    Posts
    7
    It is a text file with random data inside.

    Example:
    ---------------------------------------------------
    adfadf dfa asgasagf koko@yahoo.com adfga asdfa aa
    sd asdf aazs dump@mail.com adfgf asdff asdfas
    sdf sg afdgag rgteggsadf gdfg sdfgsd sdfg sdfgs dfgds
    sdfgsd sdgsfg sdfgsf sdfgsf s pips@hotmail.com
    ad adfa #4t346 n5635
    --------------------------------------------------

  4. #4
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts
    1,117
    Hi.

    I augmented your data file:
    Code:
    % cat data1
    adfadf dfa asgasagf koko@yahoo.com adfga asdfa aa
    sd asdf aazs dump@mail.com adfgf asdff asdfas
    sdf sg afdgag rgteggsadf gdfg sdfgsd sdfg sdfgs dfgds
    sdfgsd sdgsfg sdfgsf sdfgsf s pips@hotmail.com
    ad adfa #4t346 n5635
    ad adfa #4t346 n5635 c@c.com o@o.org g@g.gov xxyyzz
    ad sdfg only-address@  @no-name.com lkjh
    Then ran this script:
    Code:
    #!/bin/sh
    
    # @(#) s1       Demonstrate regular expression for email address.
    
    FILE=${1-data1}
    
    # Print only the matched text; quote $FILE so names with spaces work.
    grep --only-matching -E '[.[:alnum:]]+@[.[:alnum:]]+' "$FILE"
    To get this:
    Code:
    % ./s1
    koko@yahoo.com
    dump@mail.com
    pips@hotmail.com
    c@c.com
    o@o.org
    g@g.gov
    For a good first approximation ... cheers, drl
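    If that first approximation turns out too loose or too narrow, a somewhat stricter pattern is one option. The following is only a sketch under stated assumptions, not the s1 script itself: extract_emails is a made-up helper name, the pattern also accepts _ % + - in the local part and hyphens in the domain, it requires an alphabetic top-level domain of two or more letters, and sort -u drops duplicates. The -o and -h options exist in GNU and BSD grep but are not required by POSIX.

```shell
#!/bin/sh
# Sketch only: extract address-like strings and remove duplicates.
# extract_emails is a hypothetical helper name, not part of the original thread.
extract_emails() {
    # Local part: letters, digits, and . _ % + -
    # Domain: dot-separated labels of letters, digits, and hyphens,
    # ending in an alphabetic top-level domain of at least two letters.
    # -o prints only the matched text, -h suppresses file name prefixes.
    grep -ohE '[[:alnum:]._%+-]+@[[:alnum:]-]+(\.[[:alnum:]-]+)*\.[[:alpha:]]{2,}' "$@" |
        sort -u
}

# Usage: extract_emails data1
```

    This still accepts plenty of strings that are not deliverable addresses, so treat the result as a candidate list.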
    Welcome - get the most out of the forum by reading forum basics and guidelines: click here.
    90% of questions can be answered by using man pages, Quick Search, Advanced Search, Google search, Wikipedia.
    We look forward to helping you with the challenge of the other 10%.
    ( Mn, 2.6.n, AMD-64 3000+, ASUS A8V Deluxe, 1 GB, SATA + IDE, Matrox G400 AGP )

  5. #5
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Code:
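    awk '
    {
      # Check every whitespace-separated field on the line.
      for (i = 1; i <= NF; i++) {
        # Print fields with a letter or digit on both sides of "@".
        if ($i ~ /[[:alnum:]]@[[:alnum:]]/) {
          print $i
        }
      }
    }' "file"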

  6. #6
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts
    1,117
    Hi.

    The awk script from ghostdog74 inspired me to simplify the solution:
    Code:
    #!/bin/sh
    
    # @(#) s5       Demonstrate most simple regular expression for email address.
    
    FILE=${1-data1}
    
    # Put each space-separated word on its own line, then keep words containing "@".
    tr ' ' '\n' <"$FILE" |
    grep '@'
    Run on data file:
    Code:
    adfadf dfa asgasagf koko@yahoo.com adfga asdfa aa
    sd asdf aazs dump@mail.com adfgf asdff asdfas
    sdf sg afdgag rgteggsadf gdfg sdfgsd sdfg sdfgs dfgds
    sdfgsd sdgsfg sdfgsf sdfgsf s pips@hotmail.com
    ad adfa #4t346 n5635
    ad adfa #4t346 n5635 c@c.com o@o.org g@g.gov xxyyzz
    ad sdfg only-address@  @no-name.com lkjh
    sdfg wacko-one@msu.edu trendnet polk271.sam@fiddlehead.com
    fdsa wizard4@future.com  lucky@911.org
    To produce:
    Code:
    % ./s5
    koko@yahoo.com
    dump@mail.com
    pips@hotmail.com
    c@c.com
    o@o.org
    g@g.gov
    only-address@
    @no-name.com
    wacko-one@msu.edu
    polk271.sam@fiddlehead.com
    wizard4@future.com
    lucky@911.org
    And from there you can filter bad addresses if you have them ... cheers, drl
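    One possible way to do that filtering, as a rough sketch: keep only tokens that have something on both sides of the "@" and at least one dot in the domain part. The name filter_addresses is made up for illustration, and the rule itself is an assumption about what counts as a bad address here.

```shell
#!/bin/sh
# Sketch only: drop tokens such as "only-address@" and "@no-name.com".
# filter_addresses is a hypothetical helper name.
filter_addresses() {
    # Whole line must be: non-empty local part, "@", then a domain
    # containing at least one dot, with no whitespace or second "@".
    grep -E '^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$'
}

# Usage, continuing the s5 pipeline:
#   tr ' ' '\n' <data1 | grep '@' | filter_addresses
```

    On the sample data this would remove only-address@ and @no-name.com and pass the other ten lines through unchanged.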

  7. #7
    Just Joined!
    Join Date
    Feb 2007
    Posts
    7
    Thanks for the help, dudes.
