  1. #1
    Just Joined!
    Join Date
    Jul 2011
    Posts
    3

    Replace IP Addresses with Hostnames in HTML File

    I have an application that summarizes our firewall logs and generates a HTML report about the traffic, such as the Top 10 Destinations.

    The problem is that these destinations are listed as IP addresses, and we actually want the reverse lookup (PTR) names instead. Having the reporting program do this is not practical: it tries to resolve every IP in the logs, and with over a million entries that takes hours, which is not acceptable. The resulting HTML file only contains a few dozen IPs, which is why I want to run a script against the HTML file instead.

    Is there a way to write a script that searches the HTML file for all occurrences of IP addresses (in the format xxx.xxx.xxx.xxx, where each octet has 1 to 3 digits), assigns each found IP address to a variable, runs nslookup on that variable, and replaces the found IP with the result of the nslookup?

    I looked at sed and awk, but neither seems to have a feature that lets me assign the matched pattern to a variable so I can run nslookup on it.

    Can someone shed some light onto this one?

    Your help is very much appreciated.
    Thanks in Advance

  2. #2
    Trusted Penguin Cabhan
    Join Date
    Jan 2005
    Location
    Seattle, WA, USA
    Posts
    3,230
    This would probably be easiest using a full scripting language, as opposed to awk. As a basic skeleton:
    Code:
    #!/usr/bin/ruby
    
    open "/path/to/file" do |io|
        io.each_line do |line|
            line.gsub!(/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/) { |s| nslookup(s) }
            puts line
        end
    end
    You would need to provide the nslookup function (and polish it up a bit), but this should do what you want.
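    One possible way to fill in the missing nslookup function is Ruby's standard resolv library instead of shelling out. This is only a sketch: the names IPV4_PATTERN and resolve_ips and the injectable resolver argument are made up for illustration, and an IP without a PTR record is simply left unchanged.

```ruby
require "resolv"

# Hypothetical pattern: four dot-separated groups of 1 to 3 digits.
IPV4_PATTERN = /\b\d{1,3}(?:\.\d{1,3}){3}\b/

# Reverse lookup for one IP. Falls back to the IP itself when there is
# no PTR record or the lookup times out. The resolver argument is an
# assumption of this sketch so the lookup can be swapped out.
def nslookup(ip, resolver = Resolv)
  resolver.getname(ip)
rescue Resolv::ResolvError, Resolv::ResolvTimeout
  ip
end

# Replace every IP-looking token in a line with its lookup result.
def resolve_ips(line, resolver = Resolv)
  line.gsub(IPV4_PATTERN) { |ip| nslookup(ip, resolver) }
end

# Possible usage against a file (path is a placeholder):
#   File.foreach("/path/to/file") { |line| puts resolve_ips(line) }
```

    Passing the resolver in as an argument is just a convenience for trying the code without touching real DNS.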
    DISTRO=Arch
    Registered Linux User #388732

  3. #3
    Linux Guru
    Join Date
    May 2011
    Posts
    1,842
    There are as many ways to do that as there are languages. Personally, I'd do it in Perl (my CGI language of choice), but basically I'd be doing exactly what Cabhan has suggested. Here's a quick Perl respin of the above:

    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    my $htmlfile = shift || die "give me an HTML file to parse\n";
    open(FH,'<',$htmlfile) or die "can't open '$htmlfile': $!\n";
    while(<FH>){
      chomp;
      next unless(/^.*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/);
      &nslookup($1);
    }
    close(FH);
    sub nslookup {
      my($ip) = @_;
      system("nslookup $ip");
    }
    Pick an interpreter and give it a crack and post back if/when you run into problems.

  4. #4
    Just Joined!
    Join Date
    Jul 2011
    Posts
    3
    Thanks, I have an idea on where to start now.

    Can anyone fix up the regular expression for me? I can't seem to figure it out:
    It seems to be truncating the leading digits of the 1st octet, e.g. with the IPs 172.16.30.54, 10.12.33.21 and 192.168.137.254 it would match and return 2.16.30.54, 0.12.33.21, and 2.168.137.254.

  5. #5
    Linux Guru
    Join Date
    May 2011
    Posts
    1,842
    My bad, use this re instead:
    Code:
      next unless(/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$/);
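    For anyone wondering why the first version lost digits: the leading ^.* is greedy, so it swallows as much of the line as it can while still leaving one digit for the first octet. A small Ruby sketch of the same effect follows; the constant names GREEDY and FIXED are made up for illustration.

```ruby
# Same shape as the first Perl regex: a greedy ".*" in front of the group.
GREEDY = /^.*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*$/
# Without the leading "^.*" the match starts at the first digit of the IP.
FIXED  = /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/

puts "172.16.30.54"[GREEDY, 1]     # prints 2.16.30.54
puts "10.12.33.21"[GREEDY, 1]      # prints 0.12.33.21
puts "192.168.137.254"[GREEDY, 1]  # prints 2.168.137.254

puts "172.16.30.54"[FIXED, 1]      # prints 172.16.30.54

# scan returns every match on the line, not just the first one.
line = "172.16.30.54 10.12.33.21 192.168.137.254"
p line.scan(FIXED).flatten
```

    With several IPs on one line, the greedy version would also only ever capture the last one, which is another reason to drop the ^.* prefix.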

  6. #6
    Just Joined!
    Join Date
    Jul 2011
    Posts
    3

    Solution

    I just finished developing the fully functional code. I chose to do this in Perl. I'm posting the solution in case anyone needs something similar in the future.
    Code:
    #!/usr/bin/perl
    # htmlResolve.pl - Resolves all IP addresses present within a HTML file to
    #              	   DNS Hostname PTR Record
    #
    ###############################################################################
    # This script resolves the IP addresses of HTML File generated by fwlogsum
    # to the DNS Hostname PTR Record
    #
    # Created by: Achim87
    #
    # History:
    # 2011-Jul-27           Version 1.0b            Initial Version
    #
    ###############################################################################
    
    my $htmlfile = shift || die "Give me a HTML File to parse!\n";		#First Argument of Command
    my $ptrName = 0;		#Holds value of PTR Record
    my $hostname = 0;		#Holds value of PTR Record in SubRoutine resolveDNS()
    my $arrCount = 0;		#Incremental counter variable for arrIP[] array
    my @arrIP;			#Declare array to hold IP addresses on each line
    
    open(HTMLDATEI,'<',$htmlfile) or die "File '$htmlfile' could not be opened: $!\n"; #Open the HTML File
    
       print "Successfully opened HTML file $htmlfile!\n";
       print "Streaming modified lines to stdout()\n";
    
       while(<HTMLDATEI>){					#Loop through each line of the HTML File
          chomp;							#Removes newline character at end of each line
           $stringOrig = $_;				#Stores each line of HTML File for later processing
    
          while ($_ =~ /([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/g){	#In scalar context, //g steps through every IP on the line
             $arrIP[$arrCount] = $1;		#Populates arrIP Array with all IP Addresses on same line of HTML File
             $arrCount++;					#Increment array counter for each occurrence of IP on same line
          }
    
          foreach $ipAddress (@arrIP){		#Process the array and replace IP with Hostname PTR Record
             $ptrName=resolvedns("$ipAddress"); 	#Resolves the IP address by calling resolvedns() subroutine
             chomp($ptrName); 				#Remove newline \n at end of string
             $ptrName =~ s/\.$//;			#Removes the trailing period (.) from nslookup output; unlike chop(), leaves an unresolved IP intact
             $stringOrig =~ s/\Q$ipAddress\E/$ptrName/g; #Replace the IP with the found Hostname PTR Record (\Q..\E keeps the dots literal)
          }
          $arrCount=0; 						#Resets and reinitialize array counter for next line of HTML File
          @arrIP = (); 						#Resets and reinitialize array for next line of HTML File
          print "$stringOrig\n";			#Output the final results
       }
       close(HTMLDATEI);
    
    sub resolvedns{							#Subroutine to resolve IPs into DNS PTR Hostnames
       $hostname=`nslookup $_[0] | grep "name =" | tail -1 | cut -d" " -f3-`;	#Runs nslookup on IP and formats it
       
       if($hostname eq ''){
          #Returns the original IP without modification if PTR Record is
    	  #not available
          return $_[0];
       } else {
          #Returns the PTR record (Hostname) if it was resolved successfully
          return $hostname;
       }
    }
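    The grep / tail / cut pipeline in resolvedns() can also be expressed as plain string handling, which makes the "no PTR record" case easier to get right. Below is a rough Ruby sketch of that parsing step. The function names and the SAMPLE text are assumptions for illustration (SAMPLE only imitates the "name =" line that nslookup prints), and reverse_lookup assumes an nslookup binary is on the PATH.

```ruby
# Illustrative nslookup output, not captured from a real run.
SAMPLE = "3.112.231.142.in-addr.arpa\tname = wwwtest.bc.net.\n"

# Take the last "name = ..." entry (like tail -1) and drop the trailing
# period. When no PTR line is present, return the IP untouched.
def ptr_from_nslookup(output, ip)
  names = output.scan(/name = (\S+)/).flatten
  return ip if names.empty?
  names.last.chomp(".")
end

# Runs nslookup without going through a shell; raises Errno::ENOENT
# if the binary is missing.
def reverse_lookup(ip)
  output = IO.popen(["nslookup", ip], err: File::NULL, &:read)
  ptr_from_nslookup(output, ip)
end

puts ptr_from_nslookup(SAMPLE, "142.231.112.3")  # prints wwwtest.bc.net
puts ptr_from_nslookup("", "172.27.220.70")      # prints 172.27.220.70
```

    Removing the period only when a name was actually found avoids cutting the last digit off an IP that did not resolve.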
    Input HTML File (einfach.html)
    Code:
    [root@IPSTESTJANG htmldns]# cat einfach.html
    <p>Line 1: No IP</p>
    <p>Line 2: Single IP 142.231.112.3 BC Net </p>
    <p>Line 3: No IP</p>
    <p>Line 4: Three IPs 142.22.48.34, 67.226.141.76, 172.27.220.70</p>
    <p>Line 5: No IP</p>
    Output HTML File (einfach_resolved.html)
    Code:
    [root@IPSTESTJANG htmldns]# ./htmlResolve.pl einfach.html
    Successfully opened HTML file einfach.html!
    Streaming modified lines to stdout()
    <p>Line 1: No IP</p>
    <p>Line 2: Single IP wwwtest.bc.net BC Net </p>
    <p>Line 3: No IP</p>
    <p>Line 4: Three IPs windermere.vsb.bc.ca, wap01.netconnect.bchtrg-gmbh.de, 172.27.220.7</p>
    <p>Line 5: No IP</p>
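    Since the same destination tends to show up several times in a report, it may be worth resolving each distinct IP only once. Here is a hedged Ruby sketch of that idea using a Hash as a lookup cache; IP_PATTERN, lookup_ptr and resolve_document are hypothetical names, and the resolver argument is only there so the DNS part can be replaced.

```ruby
require "resolv"

# Hypothetical pattern: four dot-separated groups of 1 to 3 digits.
IP_PATTERN = /\b\d{1,3}(?:\.\d{1,3}){3}\b/

# One reverse lookup; the IP itself is returned when nothing resolves.
def lookup_ptr(ip, resolver)
  resolver.getname(ip)
rescue Resolv::ResolvError, Resolv::ResolvTimeout
  ip
end

# Replace all IPs in the whole document. The Hash default block runs
# only on the first request for an IP, so repeats cost no extra lookup
# (failed lookups are cached as well).
def resolve_document(html, resolver = Resolv)
  cache = Hash.new { |seen, ip| seen[ip] = lookup_ptr(ip, resolver) }
  html.gsub(IP_PATTERN) { |ip| cache[ip] }
end

# Possible usage (file name taken from the example in this thread):
#   puts resolve_document(File.read("einfach.html"))
```

    For a report with only a few dozen IPs the saving is small, but it keeps the number of DNS queries equal to the number of distinct addresses.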
    Last edited by achim87; 07-27-2011 at 07:00 PM. Reason: Adding Outputs
