- 11-26-2011 #1 Just Joined!
- Join Date
- Mar 2011
- Posts
- 4
Command or bash file for collecting specific data based on date
Hi,
I have an access.log file on my server, and I'm trying to collect information about the IPs that requested data during a specific period of time. An excerpt of this file is:
Code:
ip_address - - [25/Nov/2011:00:03:08 -0800] "POST url_requested HTTP/1.1" 200 6707 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:11 -0800] "GET url_requested HTTP/1.1" 304 172 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:15 -0800] "GET url_requested HTTP/1.1" 304 171 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:18 -0800] "GET url_requested HTTP/1.1" 304 172 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:12 -0800] "GET url_requested HTTP/1.1" 304 171 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:21 -0800] "GET url_requested HTTP/1.1" 304 171 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
ip_address - - [25/Nov/2011:00:03:14 -0800] "GET url_requested HTTP/1.1" 304 171 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"
I would like to check which IP addresses requested any URL during a specific period of time, e.g. 25/Nov/2011:00:03:14 - 25/Nov/2011:00:13:14.
Thanks in advance
- 11-26-2011 #2 Linux Engineer
- Join Date
- Apr 2006
- Location
- Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
- Posts
- 1,117
Hi.
You will need to perform at least 2 basic tasks as I see it:
Code:
1) Extract the time strings
2) Convert the time strings into some arithmetic form for comparison
   a) Make the string conformable to a standard form for date
   b) Find a reference from which we can use values for comparison
Here is a sketch of a shell script that lists my context, uses one of your data lines, and demonstrates one way to do both tasks:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate conversion and comparison of times.

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f "$C" ] && "$C" sed date

FILE=${1-data1}

pl " Data file $FILE contains:"
cat "$FILE"

pl " Extraction of time string:"
#2345678901234567890123456789012345678901234567890
t0=$( sed 's|^.*\[\(.*\)\].*$|\1|' "$FILE" )
echo "$t0"

# %s     seconds since 1970-01-01 00:00:00 UTC
pe
t1="25/Nov/2011:00:03:08 -0800"
t2=$( echo "$t1" | sed 's|/|-|g;s/:/ /' )
echo "$t1 -> $t2"
date --date="$t2"
date --date="$t2" '+%s'
start=$( date --date="$t2" '+%s' )

pe
t1="25/Nov/2011:00:03:11 -0800"
t2=$( echo "$t1" | sed 's|/|-|g;s/:/ /' )
echo "$t1 -> $t2"
date --date="$t2"
date --date="$t2" '+%s'
this_time=$( date --date="$t2" '+%s' )

pe
if [ "$this_time" -gt "$start" ]
then
  echo " this_time was greater than start time."
  echo " ( Extract ip from line and print here. )"
fi

exit 0
producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny)
GNU bash 3.2.39
GNU sed version 4.1.5
date (GNU coreutils) 6.10

-----
 Data file data1 contains:
ip_address - - [25/Nov/2011:00:03:08 -0800] "POST url_requested HTTP/1.1" 200 6707 "url_requested" "Mozilla/5.0 (Windows NT 6.0; rv:8.0) Gecko/20100101 Firefox/8.0"

-----
 Extraction of time string:
25/Nov/2011:00:03:08 -0800

25/Nov/2011:00:03:08 -0800 -> 25-Nov-2011 00:03:08 -0800
Fri Nov 25 02:03:08 CST 2011
1322208188

25/Nov/2011:00:03:11 -0800 -> 25-Nov-2011 00:03:11 -0800
Fri Nov 25 02:03:11 CST 2011
1322208191

 this_time was greater than start time.
 ( Extract ip from line and print here. )
See man pages for sed and date and use this as a guide to create your own solution.
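For illustration only, the two tasks (extract the time string, convert it to seconds for comparison) could be combined into a filter along these lines. This is a minimal sketch, not a finished tool: it assumes GNU date (for --date), the log layout shown in the question, and window endpoints written in the same form as the log timestamps, including the -0800 offset. The function names to_epoch and ips_between are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: print the client field of every log line inside a time window.
# Assumes GNU date and timestamps like [25/Nov/2011:00:03:08 -0800].

# Convert "25/Nov/2011:00:03:08 -0800" to seconds since the epoch.
# The sed rewrite turns it into "25-Nov-2011 00:03:08 -0800" for date.
to_epoch() {
  date --date="$(printf '%s\n' "$1" | sed 's|/|-|g;s/:/ /')" '+%s'
}

# ips_between LOGFILE START END   (hypothetical helper)
# Prints the first field of each line whose time lies in [START, END].
ips_between() {
  local file=$1 start end line stamp now
  start=$(to_epoch "$2") || return 1
  end=$(to_epoch "$3") || return 1
  while IFS= read -r line; do
    # Text between the first "[" and the next "]" is the timestamp.
    stamp=$(printf '%s\n' "$line" | sed -n 's|^[^[]*\[\([^]]*\)].*$|\1|p')
    [ -n "$stamp" ] || continue            # no timestamp on this line
    now=$(to_epoch "$stamp") || continue   # timestamp date cannot parse
    if [ "$now" -ge "$start" ] && [ "$now" -le "$end" ]; then
      printf '%s\n' "${line%% *}"          # first space-delimited field
    fi
  done < "$file"
}

# Example call (access.log is the file from the question):
#   ips_between access.log "25/Nov/2011:00:03:14 -0800" \
#                          "25/Nov/2011:00:13:14 -0800" | sort -u
```

Comparing epoch seconds rather than the raw strings means the out-of-order lines in the excerpt are handled correctly. The loop starts one date process per line, so on a very large log it will probably be slow; cutting the file down to the right day first (for example with grep) would reduce the work.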
Best wishes ... cheers, drl