  1. #1
    Just Joined!
    Join Date
    Jan 2008
    Posts
    1

    help with sed

    I have a file that looks something like this:
    <html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8"><title>Google</title><style>body,td,a,p,.h{font-family:arial,sans-serif}.h{font-size:20px}.h{color:#3366cc}.q{color:#00c}.ts td{padding:0}.ts{border-collapse:collapse}#gbar{float:left;font-weight:bold;height:22px;padding-left:2px}#gbh{border-top:1px solid #c9d7f1;font-size:0;height:0;position:absolute;right:0;top:24px ;width:200%}#gbi{background:#fff;border:1px solid;border-color:#c9d7f1 #36c #36c #a2bae7
    ..............................................
    I want to add a newline before each HTML opening tag and after each HTML closing tag, so that it looks like this:
    <html>
    <head>
    <meta ......>..........</meta>
    Any ideas how to do this using sed?

  2. #2
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Code:
    awk '{ sub(/<html>/, "<html>\n");      # newline after the opening tag
           sub(/<\/html>/, "</html>\n");   # newline after the closing tag
           print }' file
    It will be similar using sed.

  3. #3
    Linux Engineer wje_lf's Avatar
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    As kern_eye made clear with his example, he wants to do this with all HTML tags, not just <HTML> tags.

    Here ya go:
    Code:
    sed -e 's/\(<[a-zA-Z]\)/\
    \1/g' -e 's/\(<\/[^>]*>\)/\1\
    /g' 1.dat
    Be sure that a newline immediately follows the backslash you see at the end of each of the first two lines. If you're putting this in a script, you ought to be able to just cut and paste. If you're typing this at the bash prompt, the opening single quote following each -e will signal to bash that you're not done with the command, and bash will let you continue entering more of the command on the next line.

    Hope this helps.
    --
    Bill

    Old age and treachery will overcome youth and skill.

  4. #4
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    Since you're enclosing the entire matched pattern each time, you could avoid using the \( and \) and use & in the replacement string instead of \1.

  5. #5
    Linux Engineer wje_lf's Avatar
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    I didn't know that. Thank you. It works even on my ancient version of sed (3.02).
    --
    Bill

  6. #6
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    To my knowledge it's been in every version since the 1980s, so you'd need a version that predates that to find one that didn't support it! The \( \) grouping was presumably added later, so that a replacement could reuse just part of the matched text.
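
    A small sketch of a case where & is not enough and \( \) is needed, assuming a POSIX-style sed: keeping only the tag name of each opening tag and dropping its attributes. The function name strip_attrs and the sample input are invented for illustration.

```shell
# Hypothetical helper: reads HTML on stdin and drops attributes from opening tags.
strip_attrs() {
  # \( \) captures only the tag name; \1 puts it back, while the
  # attributes matched by [^>]* are left out of the replacement.
  sed 's/<\([a-zA-Z][a-zA-Z0-9]*\)[^>]*>/<\1>/g'
}

printf '%s\n' '<meta http-equiv="content-type"><title>Google</title>' | strip_attrs
```

    Here & would put the whole tag back, attributes included, so the partial match in \1 is what makes the difference. Closing tags are left alone because the pattern requires a letter right after the "<".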
