  1. #11
    Just Joined!
    Join Date
    Apr 2013
    Posts
    10

    Could I set the value of the $name variable to null and then set it the first time I come across it, then for each instance of <name> only set the variable if it is null. Then, when I come across the </item> tag I set the $name variable to null again?
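As a rough illustration of the flag idea in the question above, here is a line-based Ruby sketch. It assumes every element sits on its own line, as in the sample file later in this thread. `first_names_per_item` is a made-up helper name, and a real XML parser remains the more robust route.

```ruby
# Hypothetical helper: collect the first <name> value seen inside each <item>.
# Assumes one element per line; this is not a real XML parser.
def first_names_per_item(lines)
  names = []
  name = nil # "null" until the first <name> of the current item is seen

  lines.each do |line|
    if name.nil? && (match = line.match(%r{<name>(.*?)</name>}))
      # Only set the variable while it is still unset.
      name = match[1]
    elsif line.include?('</item>')
      # End of the item: record what was found and reset for the next item.
      names << name
      name = nil
    end
  end

  names
end
```

It could be called as `first_names_per_item(File.readlines('input.xml'))`. An item without any `<name>` line would contribute `nil` to the result.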

  2. #12
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,218
    Hmm, Python can do this as well,
    but bash is not the best tool for XML handling.

    Anyway, please forget my first Ruby script. This can be done much more easily, without copying XML nodes.

    Given input.xml
    Code:
    <?xml version="1.0"?>
    <pixxml version="1.1">
      <items>
        <item name="VALUE_A">
          <properties>
            <name>VALUE_B</name>
            <path>VALUE_C</path>
            <description>This is a description</description>
            <status></status>
            <approved></approved>
            <item_type></item_type>
            <created_by id="128799">
              <name>Some Dude</name>
            </created_by>
            <created_timestamp>2012-04-03T07:14:03Z</created_timestamp>
            <modified_by id="32105547">
              <name>Another Dude</name>
            </modified_by>
            <modified_timestamp>2013-04-19T00:56:02Z</modified_timestamp>
            <width>1280</width>
            <height>720</height>
            <timebase>23.976</timebase>
            <mime_type>video/quicktime</mime_type>
          </properties>
          <attributes />
          <tags />
          <notes>
            <note id="31364363">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:29Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:29Z</modified_timestamp>
              <text>Note text</text>
              <has_markup>false</has_markup>
              <start_frame>1270</start_frame>
            </note>
            <note id="31364499">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:58Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:58Z</modified_timestamp>
              <text>Note Text</text>
              <has_markup>false</has_markup>
              <start_frame>3499</start_frame>
            </note>
          </notes>
          <approvals />
        </item>
        <item name="VALUE_D">
          <properties>
            <name>VALUE_E</name>
            <path>VALUE_F</path>
            <description>This is a description</description>
            <status></status>
            <approved></approved>
            <item_type></item_type>
            <created_by id="128799">
              <name>Some Dude</name>
            </created_by>
            <created_timestamp>2012-04-03T07:14:03Z</created_timestamp>
            <modified_by id="32105547">
              <name>Another Dude</name>
            </modified_by>
            <modified_timestamp>2013-04-19T00:56:02Z</modified_timestamp>
            <width>1280</width>
            <height>720</height>
            <timebase>23.976</timebase>
            <mime_type>video/quicktime</mime_type>
          </properties>
          <attributes />
          <tags />
          <notes>
            <note id="31364363">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:29Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:29Z</modified_timestamp>
              <text>Note text</text>
              <has_markup>false</has_markup>
              <start_frame>1270</start_frame>
            </note>
            <note id="31364499">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:58Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:58Z</modified_timestamp>
              <text>Note Text</text>
              <has_markup>false</has_markup>
              <start_frame>3499</start_frame>
            </note>
          </notes>
          <approvals />
        </item>
      </items>
    </pixxml>
    pixxmltransform.rb
    Code:
    #!/usr/bin/env ruby
    
    require 'nokogiri'
    
    doc = Nokogiri::XML(File.open("input.xml")) do |in_data|
      in_data.nonet.strict.noblanks
    end
    
    doc.xpath('//pixxml/items/item').each do |item|
      attr_name = item.xpath('.//properties/name').first.text
      attr_path = item.xpath('.//properties/path').first.text
    
      item.set_attribute('name', attr_name)
      item.set_attribute('path', attr_path)
    
      item.xpath('.//properties/name').remove
      item.xpath('.//properties/path').remove
    end
    
    
    puts doc.to_xml
    Result:
    Code:
    <?xml version="1.0"?>
    <pixxml version="1.1">
      <items>
        <item name="VALUE_B" path="VALUE_C">
          <properties>
            <description>This is a description</description>
            <status/>
            <approved/>
            <item_type/>
            <created_by id="128799">
              <name>Some Dude</name>
            </created_by>
            <created_timestamp>2012-04-03T07:14:03Z</created_timestamp>
            <modified_by id="32105547">
              <name>Another Dude</name>
            </modified_by>
            <modified_timestamp>2013-04-19T00:56:02Z</modified_timestamp>
            <width>1280</width>
            <height>720</height>
            <timebase>23.976</timebase>
            <mime_type>video/quicktime</mime_type>
          </properties>
          <attributes/>
          <tags/>
          <notes>
            <note id="31364363">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:29Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:29Z</modified_timestamp>
              <text>Note text</text>
              <has_markup>false</has_markup>
              <start_frame>1270</start_frame>
            </note>
            <note id="31364499">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:58Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:58Z</modified_timestamp>
              <text>Note Text</text>
              <has_markup>false</has_markup>
              <start_frame>3499</start_frame>
            </note>
          </notes>
          <approvals/>
        </item>
        <item name="VALUE_E" path="VALUE_F">
          <properties>
            <description>This is a description</description>
            <status/>
            <approved/>
            <item_type/>
            <created_by id="128799">
              <name>Some Dude</name>
            </created_by>
            <created_timestamp>2012-04-03T07:14:03Z</created_timestamp>
            <modified_by id="32105547">
              <name>Another Dude</name>
            </modified_by>
            <modified_timestamp>2013-04-19T00:56:02Z</modified_timestamp>
            <width>1280</width>
            <height>720</height>
            <timebase>23.976</timebase>
            <mime_type>video/quicktime</mime_type>
          </properties>
          <attributes/>
          <tags/>
          <notes>
            <note id="31364363">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:29Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:29Z</modified_timestamp>
              <text>Note text</text>
              <has_markup>false</has_markup>
              <start_frame>1270</start_frame>
            </note>
            <note id="31364499">
              <created_by id="23306">
                <name>Director</name>
              </created_by>
              <created_timestamp>2012-04-03T23:09:58Z</created_timestamp>
              <modified_timestamp>2012-04-03T23:09:58Z</modified_timestamp>
              <text>Note Text</text>
              <has_markup>false</has_markup>
              <start_frame>3499</start_frame>
            </note>
          </notes>
          <approvals/>
        </item>
      </items>
    </pixxml>

    Notes:
    - These lines will take the first occurrence of <properties><name> and <properties><path>
    Code:
    attr_name = item.xpath('.//properties/name').first.text
    attr_path = item.xpath('.//properties/path').first.text
    - These lines will remove *all* <properties><name> and <properties><path> elements
    Code:
    item.xpath('.//properties/name').remove
    item.xpath('.//properties/path').remove
    So if this selects the wrong elements or removes too much, you will need more specific XPath expressions.
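One way to narrow the match is to anchor the path at the item itself and pick a single element rather than a whole node set. The sketch below is only an illustration: it uses REXML from Ruby's standard library instead of Nokogiri so that it runs without extra gems (that swap is an assumption of this sketch, not what the script above does), and `promote_name_and_path` and `SAMPLE_XML` are made-up names. With Nokogiri, `item.at_xpath('./properties/name')` should select the same single element.

```ruby
require 'rexml/document'

# Trimmed-down sample in the shape of input.xml (illustrative only).
SAMPLE_XML = <<~XML
  <pixxml version="1.1">
    <items>
      <item name="VALUE_A">
        <properties>
          <name>VALUE_B</name>
          <path>VALUE_C</path>
          <created_by id="128799">
            <name>Some Dude</name>
          </created_by>
        </properties>
      </item>
    </items>
  </pixxml>
XML

# Hypothetical helper: move properties/name and properties/path into
# attributes of the enclosing <item>, touching only those two elements.
def promote_name_and_path(xml)
  doc = REXML::Document.new(xml)

  REXML::XPath.each(doc, '//items/item') do |item|
    # Paths are relative to the item, so nested <name> elements
    # (for example under created_by) are never selected.
    name_el = REXML::XPath.first(item, 'properties/name')
    path_el = REXML::XPath.first(item, 'properties/path')
    next if name_el.nil? || path_el.nil?

    item.add_attribute('name', name_el.text)
    item.add_attribute('path', path_el.text)

    # Remove exactly the elements that were read, nothing else.
    name_el.remove
    path_el.remove
  end

  doc
end

puts promote_name_and_path(SAMPLE_XML)
```

Removing the same element objects that were read, instead of running a second query, keeps the "select" and "remove" steps from drifting apart.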
    Last edited by Irithori; 04-20-2013 at 07:07 AM.
    You must always face the curtain with a bow.

  3. #13
    Just Joined!
    Join Date
    Apr 2013
    Posts
    10
    Quote Originally Posted by Irithori View Post
    Hmm, python can do this as well
    but bash is not the best for xml handling.

    Anyway, please forget my first ruby script. It can be done much easier and without copying xml nodes.
    Thank you! I am almost there. I have expanded on this a little and it now looks like this:

    Code:
    #!/usr/bin/env ruby
    
    require 'rubygems'
    require 'nokogiri'
    
    doc = Nokogiri::XML(File.open("testhoc.xml")) do |in_data|
     in_data.nonet.strict.noblanks
    end
    
    doc.xpath('//pixxml/items/item').each do |item|
     attr_name = item.xpath('.//properties/name').first.text
     attr_path = item.xpath('.//properties/path').first.text
    
     item.set_attribute('name', attr_name)
     item.set_attribute('path', attr_path)
    
     item.xpath('.//properties/name').remove
     item.xpath('.//properties/path').remove
     item.xpath('.//notes/note/name').remove
     item.xpath('.//notes/note/created_by').remove
     item.xpath('.//notes/note/created_timestamp').remove
     item.xpath('.//notes/note/modified_timestamp').remove
     item.xpath('.//notes/note/has_markup').remove
    
    end
    
    puts doc.to_xml
    The one thing left to make this work perfectly is to remove an attribute from the elements at notes/note. Each one looks like this:

    <note id="1234567">

    I need to turn that into just <note>. Can you tell me how?

    Thanks,

    Dan

  4. #14
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,218
    Try adding this line at the end of the loop
    Code:
    item.xpath('.//notes/note').remove_attr('id')

  5. #15
    Just Joined!
    Join Date
    Apr 2013
    Posts
    10
    Perfect! Thanks so much!

    -Dan

  6. #16
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,218
    yw

    You might want to refactor the code a little bit to get rid of redundant parts.
    Code:
      # Every path in this list is removed from the current item.
      ['.//properties/name',              './/properties/path',
       './/notes/note/name',              './/notes/note/created_by',
       './/notes/note/created_timestamp', './/notes/note/modified_timestamp',
       './/notes/note/has_markup'].each do |x|
        item.xpath(x).remove
      end
      item.xpath('.//notes/note').remove_attr('id')

    Also be aware that the script is just a skeleton:
    - The input file is hardcoded (e.g. File.open("testhoc.xml")).
    - It is not configurable (it should parse command-line options and/or a config file).
    - It has no sanity checks. For example, if an item has no properties/name, first returns nil and this line fails with a NoMethodError:
    attr_name = item.xpath('.//properties/name').first.text
    - It has no error reporting.
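A minimal sketch of how two of these gaps might be closed, using only Ruby's standard library. The helper names `parse_options` and `first_text!`, the `-i`/`--input` option, and the default file name are all assumptions for illustration, not part of the script above.

```ruby
require 'optparse'

# Hypothetical helper: read the input file name from the command line
# instead of hardcoding it. Falls back to 'input.xml' when -i is absent.
def parse_options(argv)
  options = { input: 'input.xml' }
  OptionParser.new do |opts|
    opts.banner = 'Usage: pixxmltransform.rb [options]'
    opts.on('-i', '--input FILE', 'XML file to transform') do |file|
      options[:input] = file
    end
  end.parse!(argv)
  options
end

# Hypothetical helper: return the text of the first node in a node set,
# or raise a descriptive error instead of failing on nil.
# Works with anything that responds to #first and whose elements respond
# to #text, which includes the result of item.xpath(...).
def first_text!(nodes, label)
  node = nodes.first
  raise ArgumentError, "missing element: #{label}" if node.nil?

  node.text
end
```

Inside the loop this might be used as `attr_name = first_text!(item.xpath('.//properties/name'), 'properties/name')`, with the file opened via `File.open(parse_options(ARGV)[:input])`.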

  7. #17
    Just Joined!
    Join Date
    Apr 2013
    Posts
    10
    Noted. Thanks again!

  8. #18
    Trusted Penguin
    Join Date
    May 2011
    Posts
    4,307
    Quote Originally Posted by strngr12
    In other words, once I move the name and path values into the <item> tag I need to delete the lines they came from. However, all other <name> tags should remain. I just need to delete the first one after properties. There will be no more instances of <path>.
    Sorry, I was out of pocket for a bit, but Irithori has taken good care of you.

    Quote Originally Posted by Irithori
    but bash is not the best for xml handling.
    too true, heh.

    @strngr12, if and when you are satisfied, you can always mark this thread as Solved using the Thread Tools link at the top of the page.
