  1. #1
    Just Joined!
    Join Date
    Aug 2006
    Posts
    2

    String extraction.

    Hi,

    I have a problem extracting text from the logs.
    Here is what the log file looks like.

    linenumber:INFO |Date|time|name|s=sessionid>..garbage.
    Example:
    69736:INFO |2009/07/17|15:26:59|p=Helloworld|s=A376B050AD83524106D7023F45233DD> Session "A376B050AD83524106D7023F45233DD" created.

    Here,
    Linenumber values are 1, 2, 3, etc.
    INFO, Date, time, and sessionid are fixed size.
    name is variable size.

    I want to retrieve these fields from the log file: line number, date, time, and sessionid.


    Work done:
    I tried awk:
    grep -n 'created\|disposed' /opt/tomcat/logs/catalina.out | awk '{print $1}'
    But this gives output in this format:

    Linenumber:INFO

    I tried print $2, $3, and so on, but no luck.

    Please help me out.


    Thanks,
    Srikanth.

  2. #2
    Super Moderator MikeTbob's Avatar
    Join Date
    Apr 2006
    Location
    Texas
    Posts
    7,144
    Hello and Welcome.
    Posting the same thread in multiple forums is discouraged. Please continue here only. Thank you.

  3. #3
    Just Joined!
    Join Date
    Aug 2006
    Posts
    2
    Yes, I just realized my mistake. I thought of removing the post from the Fedora forum because it is not related to that,
    but it has already been removed.

    Thanks,
    Srikanth.

  4. #4
    Trusted Penguin Cabhan's Avatar
    Join Date
    Jan 2005
    Location
    Seattle, WA, USA
    Posts
    3,230
    The real problem that I see here is that you have lots of different delimiters, one of which is a space (separating "INFO" and the date). The trouble is that a space is not always a delimiter in this line, as seen in the message following the ">". However, your line does follow a definite pattern. For this reason, I don't believe that awk is the right tool for the job: awk splits a line based on delimiters, whereas we want to match a pattern.
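
    To see why the awk attempt printed "Linenumber:INFO", here is a minimal sketch of awk's default splitting on the sample line from the first post. The variable names are only illustrative, and the session ID is written without a space on the assumption that it matches the quoted ID in the message.

```shell
# Sample line from the first post (session ID assumed to contain no space).
line='69736:INFO |2009/07/17|15:26:59|p=Helloworld|s=A376B050AD83524106D7023F45233DD> Session "A376B050AD83524106D7023F45233DD" created.'

# With no -F option, awk splits on runs of whitespace, so field 1 is
# everything up to the first space and field 2 runs up to the space after ">".
first=$(printf '%s\n' "$line" | awk '{print $1}')
second=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'field 1: %s\n' "$first"
printf 'field 2: %s\n' "$second"
```

    Field 1 comes out as "69736:INFO" and field 2 holds the date, time, name, and session ID still glued together with "|", which is why printing $2 or $3 does not isolate them.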

    Therefore, let's use sed instead. With sed, we can do something like this:
    Code:
    sed -E 's/([0-9]+):INFO \|([0-9]+\/[0-9]+\/[0-9]+)\|([0-9]+:[0-9]+:[0-9]+)\|([^|]+)\|s=([0-9A-Za-z]+)> .+/\1 \2 \3 \4 \5/' /opt/tomcat/logs/catalina.out
    This looks pretty complicated, but it's not. Basically, I wrote a regular expression for sed that matches the format you gave, and captures the fields you want (linenumber, date, time, name, and sessionid). These are stored in \1, \2, ..., \5. I then print them out with a space between them, though you can use any delimiter you want.
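
    As a quick check, the same expression can be run on the sample line from the first post. This is only a sketch: the second input line is made up to show a non-matching case, and -n together with the trailing p is a small variation on the command above, so that only lines where the substitution succeeded are printed (a plain s command prints non-matching lines unchanged).

```shell
# First line: the example from the first post (session ID assumed to have no space).
# Second line: a made-up line that does not follow the format.
sample='69736:INFO |2009/07/17|15:26:59|p=Helloworld|s=A376B050AD83524106D7023F45233DD> Session "A376B050AD83524106D7023F45233DD" created.
some unrelated log line'

# -n suppresses sed's default output; the p flag prints only the lines
# on which the substitution matched.
fields=$(printf '%s\n' "$sample" |
  sed -nE 's/([0-9]+):INFO \|([0-9]+\/[0-9]+\/[0-9]+)\|([0-9]+:[0-9]+:[0-9]+)\|([^|]+)\|s=([0-9A-Za-z]+)> .+/\1 \2 \3 \4 \5/p')

printf '%s\n' "$fields"
```

    Note that the name is captured together with its "p=" prefix. If the line numbers come from grep -n rather than from the file itself (the first post is not explicit about this), the grep output would need to be piped into sed instead of passing the log file to sed directly.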

    You now have your fields. Depending on what you want to do with them, the next step is up to you: in a Bash script, you could use cut to separate the actual fields from each other, or you can just change the way that sed prints them.
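
    A minimal sketch of the cut approach, assuming one line of the space-separated output described above and assuming the name never contains a space. The variable names are illustrative.

```shell
# One record in the "linenumber date time name sessionid" form.
record='69736 2009/07/17 15:26:59 p=Helloworld A376B050AD83524106D7023F45233DD'

# cut picks a single field by position, using a space as the delimiter.
lineno=$(printf '%s\n' "$record" | cut -d ' ' -f 1)
logdate=$(printf '%s\n' "$record" | cut -d ' ' -f 2)
logtime=$(printf '%s\n' "$record" | cut -d ' ' -f 3)
sessionid=$(printf '%s\n' "$record" | cut -d ' ' -f 5)

printf '%s %s %s %s\n' "$lineno" "$logdate" "$logtime" "$sessionid"
```

    If a name could contain spaces, a different delimiter in the sed replacement (a comma or tab, say) with the matching cut -d value would be the safer choice.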
