- 01-21-2013 #1Just Joined!
- Join Date
- Jul 2011
- Posts
- 18
filtering out some strings
Hi,
I want to filter out lines in which a specific character makes up more than 49% of the line.
For example:
line 1 : QWERWWWRWT
line 2 : QWERTYUIPPP
After:
line 2 only, because in line 1 "W" accounts for 5 of the 10 characters.
Is there an easy way to do this in bash, sed, or awk?
Thanks!
Pap
- 01-21-2013 #2
awk can definitely do the job. However, I have neither the time nor the knowledge to give you ready-to-rumble code snippets. Work your way through awk tutorials on how to count characters with it. Mostly you'll have to count all "W" characters in one variable and the rest in another variable, then compare the two at the end of each line. A good start will be this site.
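The two-counter idea described above could look roughly like this. It is only a sketch: the function name `count_filter` is made up for illustration, "W" is hard-coded, and "keep the line when the other characters outnumber W" is an assumed reading of the request.

```shell
# Hypothetical helper: keep only lines in which W is less than half of the characters.
count_filter() {
    awk '{
        w = 0; other = 0
        # walk the line one character at a time
        for (i = 1; i <= length($0); i++) {
            if (substr($0, i, 1) == "W") w++
            else other++
        }
        # at the end of the line, compare the two counters
        if (other > w) print
    }' "$@"
}

printf '%s\n' QWERWWWRWT QWERTYUIPPP | count_filter   # prints: QWERTYUIPPP
```

Empty lines are dropped too, since both counters stay at zero.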
- 01-21-2013 #3Linux Newbie
- Join Date
- Nov 2012
- Posts
- 136
hi,
it's probably easier the other way:
- keep line in a variable
- substitute "W" with nothing in kept line
- get length of line and of modified kept line
- do line's length minus modified line's length and do the percentage
- if it's less than 49, then print line
Use awk, because bash can't do floating-point arithmetic.
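The steps above can be sketched in awk as follows. The function name `pct_filter` is hypothetical, and two details are assumptions rather than part of the post: the comparison is `<= 49` so that it matches the original request (drop lines with more than 49%), and empty lines are skipped to avoid dividing by zero.

```shell
# Hypothetical helper following the listed steps, with W and 49 hard-coded.
pct_filter() {
    awk '{
        kept = $0                    # keep the line in a variable
        gsub(/W/, "", kept)          # substitute "W" with nothing in the copy
        total = length($0)           # length of the original line
        if (total == 0) next         # assumption: skip empty lines
        pct = (total - length(kept)) * 100 / total   # percentage of W
        if (pct <= 49) print
    }' "$@"
}

printf '%s\n' QWERWWWRWT QWERTYUIPPP | pct_filter   # prints: QWERTYUIPPP
```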
- 01-21-2013 #4
gsub should be able to remove every W (gsub("[W]", "")) and every non-W (gsub("[^W]", "")). Comparing the lengths of the two resulting strings should then be enough, no fancy arithmetic needed.
More or less something like this:
Code:
{
    line_nochar = line_onlychar = $0
    gsub("[W]", "", line_nochar)        # strip every W
    gsub("[^W]", "", line_onlychar)     # strip everything except W
    if (length(line_nochar) > length(line_onlychar)) print   # matched: W is under half
}
- 01-21-2013 #5Linux Newbie
- Join Date
- Nov 2012
- Posts
- 136
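Yes, gsub() returns the number of substitutions.
«49» makes me think that it may be any "ratio", not only 50-50.
Code:
awk '{
    keep = $0
    Ws = gsub("W", "", keep)        # Ws = number of W removed from the copy
    if (length(keep) > Ws) print    # print when the other characters outnumber W
}'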
- 01-21-2013 #6Just Joined!
- Join Date
- Jul 2011
- Posts
- 18
- 01-22-2013 #7Just Joined!
- Join Date
- Jul 2011
- Posts
- 18
- 01-22-2013 #8Linux Newbie
- Join Date
- Nov 2012
- Posts
- 136
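Replace the if statement with this one:
Code:
# print when at least 60% of the line is not W; the length check avoids dividing by zero on empty lines
if (length($0) > 0 && (length(keep) / length($0)) * 100 >= 60) print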
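Putting the pieces from this thread together, a version with the character and the threshold passed in from the shell might look like the sketch below. The wrapper name `filter_by_char` and its argument order are invented for illustration, and it assumes the character is a single ordinary one (not `^`, `]` or `\`, which have special meaning inside a bracket expression).

```shell
# Hypothetical wrapper: filter_by_char CHAR MIN_OTHER_PCT [FILE...]
# Prints lines in which at least MIN_OTHER_PCT percent of the characters are not CHAR.
filter_by_char() {
    c=$1; min=$2; shift 2
    awk -v c="$c" -v min="$min" '{
        keep = $0
        gsub("[" c "]", "", keep)    # remove every occurrence of c from the copy
        if (length($0) > 0 && length(keep) / length($0) * 100 >= min) print
    }' "$@"
}

printf '%s\n' QWERWWWRWT QWERTYUIPPP | filter_by_char W 60   # prints: QWERTYUIPPP
```

With a threshold of 50 instead of 60, the first sample line would be printed as well, because exactly half of it is not "W".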
- 01-22-2013 #9Just Joined!
- Join Date
- Jul 2011
- Posts
- 18



