- 09-05-2008 #1 Just Joined!
- Join Date
- Sep 2008
- Posts
- 1
shell script to find common words in files
Hi,
Can anybody help me find the words common to several files? For example, suppose I have 3 files like:
file 1
12
54
65
67
file 2
23
54
65
66
76
file 3
12
23
34
54
66
and so on...
These files have special characteristics: the words are integers, and they are all in ascending order.
What I would like my output to be is:
1 file1,file2,file3
2 file1,file2
2 file1,file3
3 file2,file3
where the first column is the number of matching words. I have written a script that checks the files two at a time, given below:
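#!/bin/sh
# Compare every pair of files and record how many words they share.
# Assumes the files are named w1.txt .. w296.txt.
rm -f subst.txt
touch subst.txt
a=1
while [ "$a" -le 295 ]
do
    b=`expr $a + 1`
    while [ "$b" -le 296 ]
    do
        # lines of w$a.txt that contain a whole word listed in w$b.txt
        grep -Fwf "w$b.txt" "w$a.txt" > tmp.txt
        p=`wc -l tmp.txt | awk '{print $1}'`
        echo "$p"
        # plain test instead of (( )), which /bin/sh may not support
        if [ "$p" -ne 0 ]
        then
            echo "$p,w$a,w$b," >> subst.txt
        else
            echo "null"
        fi
        b=`expr $b + 1`
    done
    a=`expr $a + 1`
done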
but it would be a dumb idea to extend it with more while loops nested inside the conditional statement, breaking out when no match is found.
Does anyone have a brighter idea than this? Frankly, I'm not that good at shell scripts!
Thank you in advance.
/si-thk
- 09-07-2008 #2 Linux Engineer
- Join Date
- Feb 2005
- Posts
- 1,044
If your maximum value is 296, have you thought about using an indexed array? You can run through the first file and store the file name in the indexed variable for each value it contains, then run through the second file and append its name to any existing contents, then repeat for the third file. Then you can process the array to produce your report.