- 05-30-2011 #1
variable matching with zcat?
Hi, I'm trying to do some data correlation in Linux/Unix with zcat.
If I run zcat 2[3,6]/*/ktraw_somecode | raw-extract-messages -t ins
I get a list of messages of the format:
2011-05-26 ins r=35648634&s=464654&u=d4867498dfgdfhfdhf3465456
where there are three variables, r, s and u, each with a unique value.
And if I run zcat 2[3,6]/*/ktraw_somecode | raw-extract-messages -t apa
I get another list of messages of the format:
2011-05-26 apa s=123456&&u=d4867498dfgdfhfdhf3465456
What I need is to match the u variables between these two lists of messages and see how many of them match.
- 05-30-2011 #2
Code:
join -t "=" -1 4 -2 3 <(zcat 2[3,6]/*/ktraw_somecode | raw-extract-messages -t ins | sort -t "=" -k 4,4) <(zcat 2[3,6]/*/ktraw_somecode | raw-extract-messages -t apa | sort -t "=" -k 3,3)
Note: this is a bit evil, as
- it relies on the input format to never change
- the construct of commands is fragile and redundant
- there is no error checking at all
- and depending on how many lines of output you expect, it might take some memory
I would see that one-liner more as an ad hoc method than as a bulletproof solution.
You will probably need some awk/perl/python along with temporary files or even a small database to do it properly.
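For what it's worth, here is a rough sketch of the awk-plus-temporary-files route. It assumes the output of each raw-extract-messages run has been saved to a plain text file first, and that the u=... pair always sits in the last whitespace-separated field, as in the sample lines from the first post. The function names extract_u and count_common_u are made up for illustration. It looks u up by name instead of by field position, so an extra or missing r=/s= pair does not shift the result.

```sh
# extract_u: read message lines on stdin, print the value of each u= parameter.
extract_u() {
    awk '{
        # The last whitespace-separated field holds the key=value pairs.
        n = split($NF, params, "&")
        for (i = 1; i <= n; i++)
            if (params[i] ~ /^u=/)
                print substr(params[i], 3)   # drop the leading "u="
    }'
}

# count_common_u FILE_A FILE_B: print how many distinct u values occur in both files.
count_common_u() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    # comm needs sorted input; sort -u also removes repeated u values.
    extract_u < "$1" | sort -u > "$tmp_a"
    extract_u < "$2" | sort -u > "$tmp_b"
    # comm -12 keeps only the lines present in both sorted files.
    comm -12 "$tmp_a" "$tmp_b" | wc -l | tr -d ' '
    rm -f "$tmp_a" "$tmp_b"
}

# Demo with the two sample lines from the first post.
ins_file=$(mktemp)
apa_file=$(mktemp)
printf '%s\n' '2011-05-26 ins r=35648634&s=464654&u=d4867498dfgdfhfdhf3465456' > "$ins_file"
printf '%s\n' '2011-05-26 apa s=123456&&u=d4867498dfgdfhfdhf3465456' > "$apa_file"
count_common_u "$ins_file" "$apa_file"   # prints 1
rm -f "$ins_file" "$apa_file"
```

With real data you would first redirect each pipeline into a file, e.g. zcat 2[3,6]/*/ktraw_somecode | raw-extract-messages -t ins > ins.txt, and then run count_common_u ins.txt apa.txt. Since sort can spill to disk, this should cope with large inputs better than holding everything in memory, though it still does no validation of malformed lines.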

