  1. #1
    Just Joined!
    Join Date
    May 2012
    Posts
    85

    use sed to extract a matching number


    I have an XML file whose content is a single long line.
    I want to extract the number after "<yt:duration seconds='" and before "'/>".
    In the following example, I want to get 131:
    Code:
    .....<yt:duration seconds='131'/>....
    The "...." here stands for the substrings before and after "<yt:duration seconds='131'/>", which are long and contain assorted letters, numbers and punctuation.

    How can I use sed to extract the matching number?
    Thanks

  2. #2
    Trusted Penguin Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,382
    IMHO, a regex is not the right tool for XML files. It is unnecessarily complex and error-prone.

    XML, however, is structured by design.
    My suggestion would be to use a tool that can handle XML natively, such as the Ruby gem Nokogiri.
    Last edited by Irithori; 10-25-2012 at 01:55 PM.
    You must always face the curtain with a bow.

  3. #3
    Just Joined!
    Join Date
    May 2012
    Posts
    85
    I know, but for my XML file I think it is OK to use sed.

  4. #4
    Just Joined!
    Join Date
    Aug 2011
    Posts
    51
    This works on my Mac:
    Code:
    $ cat YourTestFile.txt
    <td><....><href="URLs goes here">blah</...> <td><...><href="URLs goes here">blahblah</...> <yt:duration seconds='131'/> things fdaakda daf
    $ sed -e "s/\(.*<yt:duration seconds='\)\([0-9]\{1,\}\)\('.*\)/\2/" YourTestFile.txt
    131
    $
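
A slightly stricter variant of the command above, offered only as a sketch: with `-n` and the `p` flag, sed prints a line only when the substitution actually matched, so a file without a `yt:duration` tag produces no output instead of echoing the whole line back. The sample line and the `sample` and `duration` variable names are made up for illustration; the pattern assumes the attribute is written with single quotes exactly as in the question.

```shell
# Hypothetical one-line sample standing in for the real XML file.
sample=$(mktemp)
printf '%s\n' "<entry><title>blah</title><yt:duration seconds='131'/><other/></entry>" > "$sample"

# -n suppresses automatic printing; the trailing p prints only lines
# where the substitution succeeded.
duration=$(sed -n "s/.*<yt:duration seconds='\([0-9][0-9]*\)'.*/\1/p" "$sample")

echo "$duration"   # prints 131
rm -f "$sample"
```

Because the leading `.*` is greedy, a line containing several `yt:duration` tags would yield the value from the last one.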
