- 09-10-2010 #1Just Joined!
- Join Date
- Sep 2010
- Posts
- 2
looking for expert sed/script help for translation/substitution
Hi All,
I'm looking for some expert help with sed or a script to work out the best way to transform one XML format into another. However, there are a few complexities around the translation.
The extra complexities are to:
1) take the start and stop times (YYYYMMDDHHMMSS), convert the start time to Unix time, and output the difference in seconds between the two times.
2) oid, tsid and sid are found by looking up an external file and finding the value against the channel. For example one of the lines in the file will be 2:806:27e2=channel1
Is there any way to write piped sed commands that can do this? If not, any ideas on what the script should look like?
Thanks in advance.
Input File
Code:
<programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
  <title lang="en">This is the title</title>
  <desc>This is the description</desc>
</programme>
Output File
Code:
<service oid="0002" tsid="0806" sid="27e2">
  <event id="0">
    <name lang="OFF" string="This is the title"/>
    <text lang="OFF" string="This is the description"/>
    <time start_time="1284098400" duration="600"/>
  </event>
</service>
Look up file for oid, tsid and sid
Code:
2:806:27e2=channel1
2:756:37a3=channel2
5:4a06:42e5=channel3
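For reference, one way the lookup file could be read is sketched below in Python. This is only an illustration, not something from the thread: the function name parse_lookup is made up, and padding each ID to four characters is an assumption based on the sample output (oid="0002", tsid="0806").

```python
def parse_lookup(lines):
    """Map channel name -> dict of oid/tsid/sid from 'oid:tsid:sid=channel' lines."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        ids, channel = line.split("=", 1)
        # Assumption: IDs are left-padded with zeros to 4 characters,
        # as in the sample output (2 -> 0002, 806 -> 0806).
        oid, tsid, sid = (part.zfill(4) for part in ids.split(":"))
        table[channel] = {"oid": oid, "tsid": tsid, "sid": sid}
    return table


sample = ["2:806:27e2=channel1", "2:756:37a3=channel2", "5:4a06:42e5=channel3"]
lookup = parse_lookup(sample)
print(lookup["channel1"])  # {'oid': '0002', 'tsid': '0806', 'sid': '27e2'}
```

With a real file you would pass the open file object instead of the sample list, since iterating over it yields one line at a time.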
- 09-11-2010 #2Just Joined!
- Join Date
- Sep 2009
- Posts
- 3
I think this will help you.
Code:
# Note: TOT is seconds since midnight (the date fields are ignored), so this gives the duration for same-day times, not the Unix start time.
[suku@host3 ~]$ VAR1=20100910060000
[suku@host3 ~]$ echo $VAR1 | awk '{ print substr($1,1,4), substr($1,5,2), substr($1,7,2), substr($1,9,2), substr($1,11,2), substr($1,13,2) }' | awk '{ TOT = $6 + $5*60 + $4*60*60 ; print TOT }' > STARTTIME.txt
[suku@host3 ~]$ STARTTIME=`cat STARTTIME.txt`
[suku@host3 ~]$ VAR2=20100910061000
[suku@host3 ~]$ echo $VAR2 | awk '{ print substr($1,1,4), substr($1,5,2), substr($1,7,2), substr($1,9,2), substr($1,11,2), substr($1,13,2) }' | awk '{ TOT = $6 + $5*60 + $4*60*60 ; RESULT = TOT - "'"$STARTTIME"'"; print RESULT }'
600
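As a rough alternative that also produces the Unix start time and works across midnight, here is a minimal Python sketch. The function names and the honor_offset flag are made up for illustration. One thing to be aware of: the sample output in the first post (start_time="1284098400") corresponds to reading 06:00:00 as UTC, i.e. ignoring the "+0100" offset. If the offset is honoured, the result is 3600 seconds earlier. Which one is wanted is an assumption the original poster would have to confirm.

```python
from datetime import datetime, timezone


def parse_stamp(stamp, honor_offset=True):
    """Parse 'YYYYMMDDHHMMSS +HHMM' into a timezone-aware datetime."""
    if honor_offset:
        return datetime.strptime(stamp, "%Y%m%d%H%M%S %z")
    # Drop the offset and treat the wall-clock digits as UTC.
    digits = stamp.split()[0]
    return datetime.strptime(digits, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def start_and_duration(start, stop, honor_offset=True):
    """Return (Unix start time, duration in seconds)."""
    t0 = parse_stamp(start, honor_offset)
    t1 = parse_stamp(stop, honor_offset)
    return int(t0.timestamp()), int((t1 - t0).total_seconds())


start, stop = "20100910060000 +0100", "20100910061000 +0100"
print(start_and_duration(start, stop))                      # (1284094800, 600)
print(start_and_duration(start, stop, honor_offset=False))  # (1284098400, 600)
```

Because the difference is taken between full datetimes rather than time-of-day values, a programme that starts at 23:50 and ends at 00:10 the next day still gets a duration of 1200.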
- 09-12-2010 #3Just Joined!
- Join Date
- Sep 2010
- Posts
- 2
Sorry, could that be used for a file with 22k records?
Thanks.
- 09-12-2010 #4Just Joined!
- Join Date
- Sep 2009
- Posts
- 3
Make a copy of the file and try it. I think it's possible.
- 09-12-2010 #5
If I were you, I wouldn't rely on bash scripting at all. I guess you could concoct a bash script to achieve the objective, but it wouldn't be very readable (and thus not easy to maintain). I'd rather use a scripting language like Python, Perl, Ruby, etc. and parse the input file with a SAX parser (Simple API for XML). This way you avoid having to parse the XML content by hand and having to load the whole document into memory.
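To give an idea of what the SAX approach might look like, here is a minimal sketch using Python's standard xml.sax module. It only collects each programme element into a dict. The class and function names are made up, and the <tv> root element in the sample is an assumption, since the thread only shows a single programme fragment.

```python
import xml.sax


class ProgrammeHandler(xml.sax.ContentHandler):
    """Call on_programme(record) once for every <programme> element."""

    def __init__(self, on_programme):
        super().__init__()
        self.on_programme = on_programme
        self.record = None   # programme currently being read
        self.field = None    # 'title' or 'desc' while inside that element

    def startElement(self, name, attrs):
        if name == "programme":
            self.record = {
                "start": attrs.get("start"),
                "stop": attrs.get("stop"),
                "channel": attrs.get("channel"),
                "title": "",
                "desc": "",
            }
        elif self.record is not None and name in ("title", "desc"):
            self.field = name

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self.record is not None and self.field:
            self.record[self.field] += content

    def endElement(self, name):
        if name in ("title", "desc"):
            self.field = None
        elif name == "programme" and self.record is not None:
            self.on_programme(self.record)
            self.record = None


def parse_programmes(xml_bytes):
    records = []
    xml.sax.parseString(xml_bytes, ProgrammeHandler(records.append))
    return records


sample = (b'<tv><programme start="20100910060000 +0100" '
          b'stop="20100910061000 +0100" channel="channel1">'
          b'<title lang="en">This is the title</title>'
          b'<desc>This is the description</desc></programme></tv>')
records = parse_programmes(sample)
print(records[0]["channel"], records[0]["title"])
```

For a 22k-record file you would call xml.sax.parse with the file name and a callback that writes each output event straight away, instead of collecting records in a list, so that only one programme is held in memory at a time.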


