  1. #1
    bvt
    Just Joined!
    Join Date
    Oct 2008
    Location
    St. Louis MO area
    Posts
    2

    trying to extract source email address - awk?

    Hi,

    I have a small bash/awk program that extracts the date, time, and size from thousands of email headers. I'm also trying to extract the last "Received: from" string from these headers, which will give me the sender's email server. Any suggestions on finding the last occurrence of this string and printing the information after it?

    tia
    Barry

  2. #2
    Linux Newbie
    Join Date
    Apr 2007
    Posts
    119
    Do you have an example of the input?

  3. #3
    bvt
    Just Joined!
    Join Date
    Oct 2008
    Location
    St. Louis MO area
    Posts
    2

    extracting Received and Date from emails

    Mark,
    below is an abbreviated message header. Patsie, from the programming forums, gave me the snippet below, which works for the last "Received: from" line. Now I'm trying to find the first instance of "Date:". My little bash/awk program extracts this information and provides statistics about email usage.
    thank you,
    Barry

    ----------------code snippet from Patsie for 'last' occurrence------------------------

    R=$(awk -F: '/^Received: from/ { sender = $2 } END { print sender }' "$b")

    -------------abbreviated email header --------------

    From: <MicrosoftExchange329e71ec88ae461536ab6ce41109e@mysite.com>
    To: <barry@mysite.com>
    Date: Tue, 31 Mar 2009 10:29:48 -0500
    ----boundary-LibPST-iamunique-13546804_-_-
    --alt---boundary-LibPST-iamunique-13566804_-
    Delivery has failed to these recipients or distribution lists:
    Received: from AAR-MV08-01.ffaa.aapps.com.com ([152.5.33.42]) by
    52vejx-ht-002.ffaa.aapps.com.com ([152.5.32.28]) with mapi; Tue, 31 Mar
    2009 10:29:48 -0500
    Content-Type: application/ms-tnef; name="winmail.dat"
    Date: Tue, 31 Mar 2009 10:29:45 -0500
    --alt---boundary-LibPST-iamunique-135866804_-
    Date: Tue, 31 Mar 2009 10:29:45 -0500
    Subject: RE: NEGT ACTION ITEMS: 26 Mar fgr Transformation Stakeholders
    Thread-Topic: NEGT ACTION ITEMS: 26 Mar fgr Transformation Stakeholders
    Thread-Index: Acmts6ZYnnafgKN1SkmK5lOvXHt5aAAxIFDQP18AAAJB3YAACG IuA
    ----boundary-LibPST-iamunique-1354866804_-_-
    Content-Type: message/rfc822
    From "MAILER-DAEMON" Tue Mar 31 10:29:45 2009
    Received: from AAR-MV08-01.ffaa.aapps.com.com ([152.5.33.42]) by
    52vejx-ht-002.ffaa.aapps.com.com ([152.5.32.28]) with mapi; Tue, 31 Mar
    2009 10:29:48 -0500
    From: "vt, BARRY J US fGA fGA/EA"
    <barry@mysite.com>
    To: "Smit, Noland l Body" <Noland.Smit@mysite.com>, "Rogers,
    Larry E Civ US fGA fGA/ECI"
    CC: fGA/ECI NEGTOPS Intation Office <fga.eci@mysite.com>
    Date: Tue, 31 Mar 2009 10:29:45 -0500
    Subject: RE: NEGT ACTION ITEMS: 26 Mar fgr Transformation Stakeholders
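
The "last occurrence" snippet works because awk overwrites the variable on every matching line, so only the final match is left when the END block runs. Below is a minimal sketch of the same idea that prints just the host name (the third whitespace-separated field) instead of everything up to the next colon. The function name `last_received_host` and the example.com sample data are illustrative assumptions, not part of the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the host from the LAST "Received: from" line.
# With awk's default whitespace splitting, $1 is "Received:", $2 is "from",
# and $3 is the sending host.
last_received_host() {
    awk '/^Received: from/ { host = $3 }
         END { if (host != "") print host }' "$1"
}

# Small demonstration on made-up header lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Received: from first.example.com ([192.0.2.1]) by
Date: Tue, 31 Mar 2009 10:29:48 -0500
Received: from second.example.com ([192.0.2.2]) by
EOF
last_received_host "$sample"   # prints second.example.com
rm -f "$sample"
```

If no "Received: from" line is present, the function prints nothing, which is easy to test for in the calling loop.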

  4. #4
    Just Joined!
    Join Date
    Jun 2010
    Posts
    6

    Have you tried grep -m 1 "^Date:" ?

    From "man grep":

    -m NUM, --max-count=NUM
    Stop reading a file after NUM matching lines. If the input is
    standard input from a regular file, and NUM matching lines are
    output, grep ensures that the standard input is positioned to
    just after the last matching line before exiting, regardless of
    the presence of trailing context lines. This enables a calling
    process to resume a search. When grep stops after NUM matching
    lines, it outputs any trailing context lines. When the -c or
    --count option is also used, grep does not output a count
    greater than NUM. When the -v or --invert-match option is also
    used, grep stops after outputting NUM non-matching lines.

    That should give you the first match of "Date:". Then you can pipe that output to awk to strip the "Date:" label. Splitting on ":" the way the Received snippet does would cut the value off at the first colon in the time (10:29:48), so it is safer to remove the label with sub(). I don't have a terminal to test with, so I haven't verified the syntax, but that should work.

    $ grep -m 1 "^Date:" myfile.txt | awk '{ sub(/^Date:[ \t]*/, ""); print }'

    Since grep has already selected the line, the awk step doesn't need its own /^Date:/ pattern, so I shortened it a bit.

    Good luck!
    -dufftime
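
The first "Date:" line can also be picked out with awk alone by exiting after the first match, which avoids the extra grep process when looping over thousands of files. This is a minimal sketch, not a tested drop-in for the original program. The function name `first_date` and the sample lines are assumptions made for illustration. sub() removes only the "Date:" label, so the colons inside the time are preserved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the FIRST "Date:" line in a file.
# "exit" stops awk after the first match, so later Date: lines are ignored.
first_date() {
    awk '/^Date:/ { sub(/^Date:[ \t]*/, ""); print; exit }' "$1"
}

# Small demonstration on made-up header lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
To: <someone@example.com>
Date: Tue, 31 Mar 2009 10:29:48 -0500
Date: Tue, 31 Mar 2009 10:29:45 -0500
EOF
first_date "$sample"   # prints Tue, 31 Mar 2009 10:29:48 -0500
rm -f "$sample"
```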
