- 02-04-2011 #1 Just Joined!
- Join Date
- Feb 2011
- Posts
- 3
help with grep .. URGENT
Hi, I am a newbie and need your expert help.
I have a doc root (let's say /site/mysite/docs/) where I want to run a recursive grep over all the directories and write the list of matching files to file_list.txt.
The search should work like this:
1. Capture all files that have "<!--# ((Any Text Here)) -->"
2. Capture all files that have BOTH "<!--# ((Any Text Here)) -->" and "<!--#include virtual= ((Path To SSI/HTML)) -->"
3. Ignore all files that have ONLY "<!--#include virtual= ((PATH TO SSI/HTML))-->"
Can someone help?
- 02-04-2011 #2 Just Joined!
- Join Date
- Feb 2011
- Posts
- 3
I was able to get the first two points done with the following:
find /site/mysite/docs/ -exec grep -ls '<!--#' {} \; > ssi_file_list.txt
However, my boss needs to cut out the files that have ONLY "<!--#include virtual= ((PATH TO SSI/HTML))-->".
- 02-07-2011 #3 Just Joined!
- Join Date
- Feb 2011
- Posts
- 3
Please help
- 02-08-2011 #4 Just Joined!
- Join Date
- Feb 2011
- Posts
- 12
You may do,
Code:
uname@ubuntu:~$ grep -rl '<!--#' /site/mysite/docs/ | sort > 1.tmp
And then do,
Code:
uname@ubuntu:~$ grep -l '<!--#[^i]' `cat 1.tmp` > 2.tmp
uname@ubuntu:~$ grep -l '<!--#include [^v]' `cat 1.tmp` >> 2.tmp
uname@ubuntu:~$ sort -u 2.tmp -o 2.tmp
And finally do,
Code:
uname@ubuntu:~$ diff 1.tmp 2.tmp | grep '^<' | sed 's/^< //'
The first command will isolate the files that contain <!--# (that is, files that contain any SSI). It searches recursively with -r, so the list holds full paths that the later commands can open.
The second step picks out, from the files in 1.tmp, those that contain an SSI directive not starting with i (pattern <!--#[^i], that is, SSI other than inclusions), and then those that contain an include that is not a virtual one (pattern <!--#include [^v], that is, SSI inclusions other than virtual ones).
So far, 1.tmp contains a list of files that contain any SSI, and 2.tmp contains a list of files that contain SSI other than virtual inclusions.
The last command will compare the two lists and output (after light formatting) the files that contain SSI, but do not contain SSI other than virtual inclusions (that is, contain only virtual inclusions). Those are the files you want to cut; the list of files to keep is 2.tmp.
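Note: I assume the files contain simple SSI. Erroneous or unusual syntax will give wrong results (for example, a directive such as <!--#if ... --> starts with i and slips past the [^i] test, and extra spaces after include defeat the second pattern), and taking such extreme cases into account really complicates the matter.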
Also,
Code:
uname@ubuntu:~$ rm 1.tmp 2.tmp
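The same filtering can probably be done in one pass, without temporary files. Below is a minimal sketch, assuming GNU grep (for -r and -o) and file names without newlines; the function name list_ssi_files is made up for illustration. It lists every file that has at least one SSI directive other than #include virtual=, which should cover points 1 and 2 of the question and drop the files from point 3.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch).
# Prints the files under directory "$1" that contain at least one SSI
# directive other than '#include virtual='.
# Assumes GNU grep (-r, -o) and file names without newlines.
list_ssi_files() {
    # Step 1: every file that contains any SSI directive at all.
    grep -rl '<!--#' "$1" | sort | while IFS= read -r f; do
        # Step 2: pull out the start of each directive in the file, then
        # check whether any of them is NOT a virtual include.
        if grep -o '<!--#[^>]*' "$f" | grep -qv '^<!--#include virtual='; then
            printf '%s\n' "$f"
        fi
    done
}
```

Something like `list_ssi_files /site/mysite/docs/ > ssi_file_list.txt` would then write the list to a file. The same caveat applies as above: a directive written with unusual spacing (for example two spaces after include) is counted as "other SSI", so the file would be kept.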

