  1. #1
    Just Joined!
    Join Date
    Sep 2010
    Posts
    2

    looking for expert sed/script help for translation/substitution


    Hi All,

    I'm looking for some expert help with sed or a script to work out the best way to transform one XML format into another. There are a few complexities around the translation.

    The extra complexities are:
    1) Take the start and stop times (YYYYMMDDHHMMSS), convert the start time to Unix time, and output the difference in seconds between the two times.
    2) oid, tsid and sid are found by looking up the channel in an external file. For example, one of the lines in the file will be 2:806:27e2=channel1

    Is there any way to write piped sed commands that can do this? If not, any ideas on what the script should look like?

    Thanks in advance.

    Input File
    Code:
    <programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
    <title lang="en">This is the title</title>
    <desc>This is the description</desc>
    </programme>
    Output File
    Code:
    <service oid="0002" tsid="0806" sid="27e2">
    <event id="0">
    <name lang="OFF" string="This is the title"/>
    <text lang="OFF" string="This is the description"/>
    <time start_time="1284098400" duration="600"/>
    </event>
    </service>
    Look up file for oid, tsid and sid
    Code:
    2:806:27e2=channel1
    2:756:37a3=channel2
    5:4a06:42e5=channel3
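
    The lookup step can be sketched in Python as a dictionary keyed by channel. This is only a minimal sketch: it assumes every line has the form oid:tsid:sid=channel, and it assumes the ids are zero-padded to four hex digits, which is inferred from the sample output above (2 becomes 0002, 806 becomes 0806). The name load_lookup is hypothetical.

```python
def load_lookup(lines):
    """Map channel name -> (oid, tsid, sid), each left-padded to 4 characters."""
    table = {}
    for line in lines:
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or malformed lines
        ids, channel = line.split("=", 1)
        parts = ids.split(":")
        if len(parts) != 3:
            continue  # expected exactly oid:tsid:sid
        # Assumption: pad with zeros to 4 digits, as in the sample output.
        table[channel] = tuple(p.zfill(4) for p in parts)
    return table


sample = [
    "2:806:27e2=channel1",
    "2:756:37a3=channel2",
    "5:4a06:42e5=channel3",
]
lookup = load_lookup(sample)
print(lookup["channel1"])
```

    For a real file, the same function can be given an open file object, for example load_lookup(open("lookup.txt")), where lookup.txt is a placeholder name.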

  2. #2
    Just Joined!
    Join Date
    Sep 2009
    Posts
    3
    I think this will help you.

    Code:
    [suku@host3 ~]$ VAR1=20100910060000
    [suku@host3 ~]$ # seconds since midnight, taken from the HHMMSS part of the timestamp
    [suku@host3 ~]$ STARTTIME=$(echo "$VAR1" | awk '{ print substr($1,9,2)*3600 + substr($1,11,2)*60 + substr($1,13,2) }')
    [suku@host3 ~]$ VAR2=20100910061000
    [suku@host3 ~]$ # duration = stop - start (only valid when both times fall on the same day)
    [suku@host3 ~]$ echo "$VAR2" | awk -v start="$STARTTIME" '{ print substr($1,9,2)*3600 + substr($1,11,2)*60 + substr($1,13,2) - start }'
    600
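
    The awk commands above only look at the HHMMSS part, so they give the duration but not the Unix start time, and they go wrong when a programme runs past midnight. A hedged alternative in Python is sketched below, using the standard datetime module. One caveat: the start_time="1284098400" in the question corresponds to 06:00:00 read as UTC, that is, with the +0100 offset ignored. The honor_offset flag is a hypothetical switch so that either reading can be produced; which one the target format wants is not stated in the thread.

```python
from datetime import datetime, timezone


def to_epoch(stamp, honor_offset=True):
    """Convert 'YYYYMMDDHHMMSS +HHMM' to Unix time in seconds."""
    if honor_offset:
        # %z parses the '+0100' suffix into a timezone offset.
        dt = datetime.strptime(stamp, "%Y%m%d%H%M%S %z")
    else:
        # Assumption: drop the offset and read the digits as UTC.
        dt = datetime.strptime(stamp.split()[0], "%Y%m%d%H%M%S")
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def start_and_duration(start, stop, honor_offset=True):
    """Return (start as Unix time, stop - start in seconds)."""
    begin = to_epoch(start, honor_offset)
    return begin, to_epoch(stop, honor_offset) - begin


print(start_and_duration("20100910060000 +0100", "20100910061000 +0100"))
print(start_and_duration("20100910060000 +0100", "20100910061000 +0100",
                         honor_offset=False))
```

    Because the subtraction is done on full timestamps, a programme from 23:55 to 00:05 the next day still yields 600 seconds.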

  3. #3
    Just Joined!
    Join Date
    Sep 2010
    Posts
    2
    Sorry, could that be used for a file with 22k records?

    Thanks.

  5. #4
    Just Joined!
    Join Date
    Sep 2009
    Posts
    3
    Make a copy of the file and try it. I think it's possible.

  6. #5
    Linux Newbie unlimitedscolobb
    Join Date
    Jan 2008
    Posts
    120
    Quote: Originally posted by hotbaws11
    I'm looking for some expert help on sed/script to work out the best way to transform one xml format into another however there are a few complexities around translation.
    If I were you, I wouldn't rely on bash scripting at all. I guess you could concoct a bash script to achieve the objective, but it wouldn't be very readable (and thus not easy to maintain). I'd rather use a scripting language like Python, Perl or Ruby and parse the input file with a SAX parser (Simple API for XML). That way you avoid having to parse the XML content by hand, and you avoid loading the whole document into memory.
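
    A minimal sketch of that SAX approach, using Python's standard xml.sax module, might look like the following. Several things here are assumptions rather than facts from the thread: the programme elements are wrapped in a single root element (called tv here), the event id is a running counter starting at 0, the timestamp digits are read as UTC with the +0100 suffix dropped (this is what reproduces the sample start_time), and the names ProgrammeHandler and to_epoch are hypothetical.

```python
import io
import xml.sax
from datetime import datetime, timezone
from xml.sax.saxutils import quoteattr


def to_epoch(stamp):
    # Assumption: digits are read as UTC and the "+0100" suffix is dropped,
    # because that reproduces start_time="1284098400" from the question.
    dt = datetime.strptime(stamp.split()[0], "%Y%m%d%H%M%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp())


class ProgrammeHandler(xml.sax.ContentHandler):
    """Streams <programme> elements and writes one <service> block for each."""

    def __init__(self, lookup, out):
        super().__init__()
        self.lookup = lookup   # channel -> (oid, tsid, sid)
        self.out = out
        self.attrs = None      # attributes of the open <programme>, if any
        self.field = None      # "title" or "desc" while inside that element
        self.text = {}
        self.event_id = 0      # assumption: running counter starting at 0

    def startElement(self, name, attrs):
        if name == "programme":
            self.attrs = dict(attrs.items())
            self.text = {"title": "", "desc": ""}
        elif name in self.text and self.attrs is not None:
            self.field = name

    def characters(self, content):
        if self.field is not None:
            self.text[self.field] += content  # text may arrive in chunks

    def endElement(self, name):
        if name == self.field:
            self.field = None
        elif name == "programme" and self.attrs is not None:
            self.write_service()
            self.attrs = None

    def write_service(self):
        ids = self.lookup.get(self.attrs.get("channel"))
        if ids is None:
            return  # channel missing from the lookup file: skip the record
        start = to_epoch(self.attrs["start"])
        duration = to_epoch(self.attrs["stop"]) - start
        self.out.write(
            '<service oid="%s" tsid="%s" sid="%s">\n' % ids
            + '<event id="%d">\n' % self.event_id
            + '<name lang="OFF" string=%s/>\n' % quoteattr(self.text["title"].strip())
            + '<text lang="OFF" string=%s/>\n' % quoteattr(self.text["desc"].strip())
            + '<time start_time="%d" duration="%d"/>\n' % (start, duration)
            + '</event>\n</service>\n'
        )
        self.event_id += 1


lookup = {"channel1": ("0002", "0806", "27e2")}
source = b"""<tv>
<programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
<title lang="en">This is the title</title>
<desc>This is the description</desc>
</programme>
</tv>"""
out = io.StringIO()
xml.sax.parseString(source, ProgrammeHandler(lookup, out))
result = out.getvalue()
print(result, end="")
```

    For a file with 22k records, xml.sax.parse("input.xml", handler) streams from disk instead of holding the document in memory (input.xml is a placeholder name). The handler keeps only the current programme, so memory use should not grow with the number of records.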
