  1. #1
    Just Joined!
    Join Date
    Apr 2007
    Posts
    22

    Excluding pattern-matched characters from replacement in sed

    I need to correct frequently occurring typos in several HTML files using sed.

    A common one is my search pattern being interrupted by newlines. Google told me that I could use
    Code:
    sed ':a;N;$!ba;s/\n/ /g'
    to turn newlines into spaces, but this will leave messy HTML. What I want to do is remove every newline that is not preceded by a closing angle bracket.
    I tried
    Code:
    sed ':a;N;$!ba;s/[^>]\n/ /g'
    But that removes the character just before the newline as well as the newline itself.
    Is there any way I can keep that letter from getting chopped out?

  2. #2
    Just Joined!
    Join Date
    Apr 2007
    Posts
    22

    doh!

    After hours of searching, I finally broke down and asked the question myself. Of course, I found the solution on my own immediately after.

    Answer: wrap the part I want to keep in parentheses (in my case, the non-bracket character), \([^>]\), and add \1 to the replacement text in the place I want it.
    Code:
    sed ':a;N;$!ba;s/\([^>]\)\n/\1 /g'
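
For anyone who wants to see the difference, here is a small side-by-side sketch. The sample HTML and the function names `chop_join` and `keep_join` are made up for illustration, and it assumes GNU sed (the one-line `:a;N;$!ba` form may not work on other sed implementations).

```shell
# Hypothetical sample: the first line is broken mid-sentence,
# the second line ends in a closing tag and should stay on its own line.
sample='<p>some text\nbroken by a newline</p>\n<p>next</p>\n'

# First attempt: [^>] is part of the match, so the character it
# matched is replaced along with the newline.
chop_join() {
    sed ':a;N;$!ba;s/[^>]\n/ /g'
}

# Fixed version: \( \) captures the character and \1 puts it back.
keep_join() {
    sed ':a;N;$!ba;s/\([^>]\)\n/\1 /g'
}

printf "$sample" | chop_join
# <p>some tex broken by a newline</p>
# <p>next</p>

printf "$sample" | keep_join
# <p>some text broken by a newline</p>
# <p>next</p>
```

In both cases the newline after `</p>` is left alone, because `[^>]` cannot match the `>` in front of it.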
