  1. #1
    Just Joined!
    Join Date
    May 2012
    Posts
    9

    Script to search files for regex expressions


    I need to write a script that searches all files of specific types (.php, .html, .js, etc.) for a list of regular expressions. The actual purpose of the script is to detect files containing known malware.


    Here is the catch: I will be adding regular expressions to this script very often, so I need something that reads the regular expressions from a separate file that I can easily add to.

    I am very proficient in bash scripting, so I wrote the script in bash. The current script builds a list of files using find, then loops through them, using grep to search each file for each expression. It greps each file multiple times (once for each regular expression). This method obviously takes a very long time and, on a less important note, is quite resource intensive.

    Does anybody know an easier way to do this? I would prefer bash, since I already know it, but I would be willing to learn another language and recreate the script in it if that would be more efficient.

    Pretty much looking for ideas, links, or specific functions that may help me achieve my goal. Does anyone have any ideas about this?

    Thanks in advance.

  2. #2
    Just Joined!
    Join Date
    Apr 2013
    Posts
    2
    samir, when you get a chance, you need to send me some more info on this.

  3. #3
    Just Joined!
    Join Date
    Sep 2007
    Location
    Silver Spring, MD
    Posts
    95
    I need to write a script that searches all files of specific types (.php, .html, .js, etc.) for a list of regular expressions. The actual purpose of the script is to detect files containing known malware.

    ---------------------------------------------------------------------------------
    This is what I came up with to help samirj09:

    Code:
    find / \( -name '*.html' -o -name '*.php' -o -name '*.js' \) -exec grep -i "test|testing|te" {} \;
    Todd
    Last edited by tdsan; 06-19-2013 at 02:00 PM.

  5. #4
    Just Joined!
    Join Date
    Feb 2005
    Posts
    5

    Use egrep?

    Quote Originally Posted by tdsan
    I need to write a script that searches all files of specific types (.php, .html, .js, etc.) for a list of regular expressions. The actual purpose of the script is to detect files containing known malware.

    This is what I came up with:

    Code:
    find / \( -name '*.html' -o -name '*.php' -o -name '*.js' \) -exec grep -i "test|testing|te" {} \;

    Todd
    I'm not a scripting genius, but I have two suggestions for you.
    1) Try using egrep (equivalent to grep -E). Plain grep uses basic regular expressions, where | is a literal character, so "test|testing|te" will not work as an alternation; egrep uses extended regular expressions, where | means "or".
    2) I think that your command will end up creating lots of processes, i.e. it launches a new grep process for each file that is found.
    If you were to use Perl or Python you could have only one process launched. This one process could read the file names that find found from a pipe, and then scan each file for the regexes you specify. You should only have to launch this Perl/Python process once.
    pgmr6809
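
A minimal sketch of the pipe idea in point 2, kept in shell rather than Perl or Python: find writes the file names to a pipe and xargs hands them to grep in large batches, so only a few grep processes are started instead of one per file. The function name list_matches is made up for illustration, and -print0, -0 and -r assume GNU findutils (or a compatible find/xargs).

```shell
#!/usr/bin/env bash
# list_matches DIR REGEX  (hypothetical helper, not from the thread)
# Lists the .html/.php/.js files under DIR whose contents match REGEX,
# using extended regex syntax and ignoring case.
list_matches() {
    local dir=$1 regex=$2
    # -print0 and -0 keep file names containing spaces or newlines intact.
    # xargs packs many names into each grep call; -r skips grep entirely
    # when find produces no names.
    find "$dir" -type f \
        \( -name '*.html' -o -name '*.php' -o -name '*.js' \) -print0 |
        xargs -0 -r grep -E -i -l -e "$regex" --
}
```

For example, list_matches /var/www 'test|testing|te' would print one matching path per line (the directory is only a placeholder). Replacing \; with + in the -exec form batches file names in a similar way without the pipe.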

  6. #5
    drl
    Linux Engineer
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,304
    Hi.

    Look at the -f option for grep ... cheers, drl
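
A rough sketch of how -f could fit the original question, assuming the expressions live one per line in a separate file: grep reads every pattern from that file and checks each file once against all of them, so adding a new signature only means appending a line. The function name scan_tree and the file name patterns.txt are hypothetical.

```shell
#!/usr/bin/env bash
# scan_tree DIR PATTERN_FILE  (hypothetical helper, not from the thread)
# Prints the path of every .php/.html/.js file under DIR that matches at
# least one extended regular expression listed in PATTERN_FILE.
scan_tree() {
    local dir=$1 patterns=$2
    # -f reads one pattern per line; -l prints only the names of matching
    # files; {} + passes many file names to each grep invocation.
    find "$dir" -type f \
        \( -name '*.php' -o -name '*.html' -o -name '*.js' \) \
        -exec grep -E -l -f "$patterns" -- {} +
}
```

Usage would look like scan_tree /var/www patterns.txt, where both arguments are placeholders. One thing to watch for: an empty line in the pattern file matches every line, so every file would be reported.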

  7. #6
    Just Joined!
    Join Date
    Sep 2007
    Location
    Silver Spring, MD
    Posts
    95

    Suggestions

    1) Try using egrep (equivalent to grep -E). Plain grep uses basic regular expressions, where | is a literal character, so "test|testing|te" will not work as an alternation; egrep uses extended regular expressions, where | means "or".
    2) I think that your command will end up creating lots of processes, i.e. it launches a new grep process for each file that is found.
    If you were to use Perl or Python you could have only one process launched. This one process could read the file names that find found from a pipe, and then scan each file for the regexes you specify. You should only have to launch this Perl/Python process once.

    ------------------------------------------------

    You have to understand, I was just chiming in on the discussion from samirj09.

    As for the code, I have changed the find statement to use egrep; thank you for the suggestion.

    Code:
     find / \( -name '*.html' -o -name '*.php' -o -name '*.js' \) -exec egrep -i "test|testing|te" {} \;
    pgmr6809, can you provide the Perl or Python script that you talked about in your previous post?

    Thank you in advance.

    T
