- 09-23-2008 #1 Just Joined!
- Join Date
- Sep 2008
- Location
- Copenhagen
- Posts
- 5
Grepping something out of a specific column in a file using pattern from another file
Hello,
I often use grep to extract lines from a file using patterns that are in another file, e.g.
grep -wf patternfile.txt infile.txt
This toy example illustrates my problem:
Infile:
A 100 3396 101 M
A 200 3488 100 M
B 100 3431 102 M
A 100 3454 121 F
C 200 3407 378 M
A 200 3440 400 M
Patternfile:
100
121
378
407
Here, I want grep to only look for matches in column 4 of the infile. So, the output that I want is:
A 200 3488 100 M
A 100 3454 121 F
C 200 3407 378 M
However, since grep also finds some matches in column 2 of the infile, what I get is:
A 100 3396 101 M
A 200 3488 100 M
B 100 3431 102 M
A 100 3454 121 F
C 200 3407 378 M
In this case I can get around it by, e.g.
awk '{print $3,$4,$5}' infile.txt | grep -f patternfile.txt > temp.txt
grep -f temp.txt infile.txt
However, I would much prefer to grep directly in a specific column using patterns from a file, as this workaround isn't always applicable.
I'd appreciate any suggestions.
- 09-23-2008 #2
A few possibilities:
1. Modify your pattern file to use the following format:
Code:
([^ ]+ ){3}100
([^ ]+ ){3}121
([^ ]+ ){3}378
([^ ]+ ){3}407
Then use:
Code:
grep -Ef patternfile infile
2. Construct the pattern with a process substitution (if your shell supports it):
Code:
% grep -Ef <(printf "([^ ]+ ){3}%s\n" $(<patternfile)) infile
A 200 3488 100 M
A 100 3454 121 F
C 200 3407 378 M
3. Use a more powerful tool:
Code:
% awk 'NR==FNR{_[$1];next}$4 in _' patternfile infile
A 200 3488 100 M
A 100 3454 121 F
C 200 3407 378 M
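For anyone reading along, here is a rough sketch of option 3 spelled out with comments, plus an anchored variant of the grep patterns from options 1 and 2. The anchors are my own addition, not part of the reply above: without `^` and a trailing `( |$)`, a pattern such as 100 could also match a longer value like 1000 in column 4, or a value in column 5. The scratch directory and the names `workdir`, `wanted`, `awk_result` and `grep_result` are made up for illustration, and the files are just the toy data from the first post.

```shell
#!/bin/sh
# Recreate the toy files from the first post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/infile" <<'EOF'
A 100 3396 101 M
A 200 3488 100 M
B 100 3431 102 M
A 100 3454 121 F
C 200 3407 378 M
A 200 3440 400 M
EOF
cat > "$workdir/patternfile" <<'EOF'
100
121
378
407
EOF

# Option 3, spelled out. While awk reads the first file (NR == FNR),
# each pattern is stored as an array key. For the second file, a line
# is printed only if its 4th field is exactly one of those keys.
awk_result=$(awk '
    NR == FNR { wanted[$1]; next }   # still reading patternfile
    $4 in wanted                     # reading infile: test column 4 only
' "$workdir/patternfile" "$workdir/infile")

# Options 1 and 2 with anchors (an assumption, not in the reply above):
# ^ pins the three skipped fields to the start of the line, and ( |$)
# requires column 4 to end right after the pattern.
sed 's/.*/^([^ ]+ ){3}&( |$)/' "$workdir/patternfile" > "$workdir/anchored"
grep_result=$(grep -Ef "$workdir/anchored" "$workdir/infile")

printf 'awk:\n%s\n' "$awk_result"
printf 'grep:\n%s\n' "$grep_result"

rm -rf "$workdir"
```

On the toy data both commands select the same three lines the original poster asked for. The awk version compares column 4 as a literal string, so it is the safer choice if the pattern file might contain characters that are special in regular expressions.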
- 09-23-2008 #3 Just Joined!
Thanks radoulov! Just what I needed. I had a hunch that awk could do the job - and I think that's the best of your solutions.