  1. #1

    looking for expert sed/script help for translation/substitution


    Hi All,

    I'm looking for some expert help with sed or a script to work out the best way to transform one XML format into another. There are a few complexities around the translation, though.

    The extra complexities are to:
    1) Take the start and stop times (YYYYMMDDHHMMSS), convert the start time to Unix time, and output the difference in seconds between the two times.
    2) Look up oid, tsid and sid in an external file, keyed by channel. For example, one of the lines in the file will be 2:806:27e2=channel1

    Is there any way to write piped sed commands that can do this? If not, any ideas what the script should look like?

    Thanks in advance.

    Input File
    Code:
    <programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
    <title lang="en">This is the title</title>
    <desc>This is the description</desc>
    </programme>
    Output File
    Code:
    <service oid="0002" tsid="0806" sid="27e2">
    <event id="0">
    <name lang="OFF" string="This is the title"/>
    <text lang="OFF" string="This is the description"/>
    <time start_time="1284098400" duration="600"/>
    </event>
    </service>
    Look up file for oid, tsid and sid
    Code:
    2:806:27e2=channel1
    2:756:37a3=channel2
    5:4a06:42e5=channel3
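A rough sketch of the lookup in item 2, written in Python rather than sed. The function name `load_lookup` is made up for illustration, and the zero-padding to four characters is only inferred from the sample output (2 becomes 0002, 806 becomes 0806).

```python
def load_lookup(lines):
    """Map channel name -> (oid, tsid, sid), each left-padded with zeros to 4 characters."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        ids, channel = line.split("=", 1)
        oid, tsid, sid = ids.split(":")
        # Padding width of 4 is an assumption taken from the sample output.
        table[channel] = tuple(part.zfill(4) for part in (oid, tsid, sid))
    return table


sample = ["2:806:27e2=channel1", "2:756:37a3=channel2", "5:4a06:42e5=channel3"]
lookup = load_lookup(sample)
print(lookup["channel1"])  # ('0002', '0806', '27e2')
```

For a real file, passing an open file object instead of the `sample` list should work the same way, since the function only iterates over lines.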

  2. #2
    I think this will help you.

    Code:
    [suku@host3 ~]$ VAR1=20100910060000
    [suku@host3 ~]$ STARTTIME=$(echo "$VAR1" | awk '{ print substr($1,9,2)*60*60 + substr($1,11,2)*60 + substr($1,13,2) }')
    [suku@host3 ~]$ VAR2=20100910061000
    [suku@host3 ~]$ echo "$VAR2" | awk -v start="$STARTTIME" '{ print substr($1,9,2)*60*60 + substr($1,11,2)*60 + substr($1,13,2) - start }'
    600
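Note that the awk arithmetic above works on seconds since midnight, so it gives the duration only when both times fall on the same day, and it does not produce a Unix timestamp. A minimal Python sketch that covers both parts of item 1 might look like this. The helper name `to_epoch` is hypothetical, and the " +0100" offset is deliberately ignored, because that is what reproduces the start_time of 1284098400 shown in the sample output.

```python
import calendar
import time


def to_epoch(stamp):
    """Convert 'YYYYMMDDHHMMSS +ZZZZ' to Unix time, ignoring the offset part."""
    digits = stamp.split()[0]  # keep only the 14 digits, drop ' +0100'
    # timegm treats the parsed fields as UTC (assumption: matches the sample output).
    return calendar.timegm(time.strptime(digits, "%Y%m%d%H%M%S"))


start = to_epoch("20100910060000 +0100")
stop = to_epoch("20100910061000 +0100")
print(start, stop - start)  # 1284098400 600
```

If the offset should be honoured instead, parsing the full string with `datetime.strptime` and a `%z` directive would be one option, but the resulting start time would then be an hour earlier than the one in the sample output.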

  3. #3
    Sorry, could that be used for a file with 22k records?

    Thanks.

  4. #4
    Make a copy of the file and try it. I think it's possible.

  5. #5
    Originally posted by hotbaws11:
    I'm looking for some expert help with sed or a script to work out the best way to transform one XML format into another. There are a few complexities around the translation, though.
    If I were you, I wouldn't rely on bash scripting at all. You could probably concoct a bash script that achieves the objective, but it wouldn't be very readable, and so it wouldn't be easy to maintain. I'd rather use a scripting language such as Python, Perl or Ruby and parse the input file with a SAX parser (Simple API for XML). That way you avoid parsing the XML by hand, and you avoid loading the whole document into memory.
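To make the SAX suggestion concrete, here is a minimal sketch using Python's standard `xml.sax` module. It is only one way this could be done, under several assumptions: the `<programme>` elements sit inside a single root element (called `<tv>` here), the offset in the timestamps is ignored so that the result matches the sample output, the event id is always "0" as in the sample, and the lookup table is already loaded into a dict. The names `ProgrammeHandler` and `to_epoch` are made up for illustration.

```python
import calendar
import io
import time
import xml.sax
from xml.sax.saxutils import quoteattr


def to_epoch(stamp):
    # Offset ignored (assumption), to match the sample output in the first post.
    return calendar.timegm(time.strptime(stamp.split()[0], "%Y%m%d%H%M%S"))


class ProgrammeHandler(xml.sax.ContentHandler):
    """Streams <programme> elements and writes one <service> block for each."""

    def __init__(self, lookup, out):
        super().__init__()
        self.lookup = lookup  # channel -> (oid, tsid, sid)
        self.out = out        # any object with a write() method
        self.prog = None      # attributes of the current <programme>
        self.field = None     # "title" or "desc" while inside one
        self.text = {}

    def startElement(self, name, attrs):
        if name == "programme":
            self.prog = dict(attrs.items())
            self.text = {"title": "", "desc": ""}
        elif name in ("title", "desc") and self.prog is not None:
            self.field = name

    def characters(self, content):
        # May be called several times per element, so accumulate.
        if self.field:
            self.text[self.field] += content

    def endElement(self, name):
        if name in ("title", "desc"):
            self.field = None
        elif name == "programme":
            self.emit()
            self.prog = None

    def emit(self):
        # Raises KeyError for a channel missing from the lookup table.
        oid, tsid, sid = self.lookup[self.prog["channel"]]
        start = to_epoch(self.prog["start"])
        duration = to_epoch(self.prog["stop"]) - start
        self.out.write(
            f'<service oid="{oid}" tsid="{tsid}" sid="{sid}">\n'
            '<event id="0">\n'  # id fixed at 0, as in the sample (assumption)
            f'<name lang="OFF" string={quoteattr(self.text["title"])}/>\n'
            f'<text lang="OFF" string={quoteattr(self.text["desc"])}/>\n'
            f'<time start_time="{start}" duration="{duration}"/>\n'
            '</event>\n</service>\n'
        )


lookup = {"channel1": ("0002", "0806", "27e2")}
source = b"""<tv>
<programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
<title lang="en">This is the title</title>
<desc>This is the description</desc>
</programme>
</tv>"""
out = io.StringIO()
xml.sax.parse(io.BytesIO(source), ProgrammeHandler(lookup, out))
print(out.getvalue())
```

Because the handler writes each block as soon as a `</programme>` tag is seen, memory use should stay roughly constant for a file with 22k records. `quoteattr` takes care of escaping titles or descriptions that contain quotes or ampersands.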
