- 07-29-2008 #1 Just Joined!
- Join Date
- Jul 2008
- Posts
- 3
Replace space, that is not in html tags <> with new line using sed
Hi, I am working on transforming HTML text into the .vert text format, and I want to use the Linux utility sed. I have this regexp, which should do the work: s/ \(?![^<>]*>\)/\n/g. I use it with sed like this:
echo "you <we try> there" | sed 's/ \(?![^<>]*>\)/\n/g'
The desired output is:
you
<we try>
there
But I get the same string as on input. Is the regexp wrong? Or am I using sed incorrectly? Thanks for your help.
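A likely explanation, offered as a sketch rather than a definitive diagnosis: sed's regular expressions (POSIX basic and extended) have no lookahead, so `(?!...)` is not read as "not followed by". In a basic regular expression, `\(?![^<>]*>\)` is an ordinary group that matches a literal `?!`, then any run of non-bracket characters, then `>`. The sample input contains no `?!`, so nothing matches and the line comes back unchanged. The snippet below assumes GNU sed (for `\n` in the replacement); `try_lookahead` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: feeds stdin through the exact pattern from the question.
try_lookahead() {
  sed 's/ \(?![^<>]*>\)/\n/g'
}

# No literal "?!" in the input, so the pattern never matches
# and the line is printed unchanged.
printf '%s\n' 'you <we try> there' | try_lookahead

# With a literal "?!" present the pattern does match, which shows that
# sed treats "?!" as plain characters rather than as a lookahead.
printf '%s\n' 'a ?!b> c' | try_lookahead
```

Tools with Perl-style regular expressions do support this kind of lookahead; with sed, the condition has to be expressed some other way.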
- 07-29-2008 #2 Linux Newbie
- Join Date
- Jul 2008
- Posts
- 181
You might wish to refine this a bit, but it basically works:
sed 's/</\n</g; s/>/>\n/g;' input_file
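One possible refinement of the command above, as a minimal sketch: once every tag sits on a line of its own, a second sed pass can split only the lines that do not start with `<`, and a third pass can drop the empty lines left behind. This assumes GNU sed (for `\n` in the replacement) and tags that do not span several input lines; `to_vert` is a hypothetical function name, not something from the thread.

```shell
# Hypothetical helper; assumes GNU sed and tags that fit on one input line.
to_vert() {
  sed 's/</\n</g; s/>/>\n/g' |   # put every tag on a line of its own
    sed '/^</!s/ /\n/g' |        # split the non-tag lines at each space
    sed '/^$/d'                  # delete the empty lines left behind
}

# Prints "you", "<we try>" and "there" on three separate lines.
printf '%s\n' 'you <we try> there' | to_vert
```

The pipeline uses three sed processes because newlines inserted by a substitution only become real line boundaries for the next process in the pipe.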
- 07-29-2008 #3 Just Joined!
- Join Date
- Jul 2008
- Posts
- 3
- 07-29-2008 #4 Linux Newbie
- Join Date
- Jul 2008
- Posts
- 181
So what exactly do you need? Is every word supposed to be on a line of its own, except for text enclosed in angle brackets, or what?
If you don't mind the blank lines (each one holds a single space), you could use this:
sed 's/\(<[^>]\+>\| \)/&\n/g;'
And if you do, you can use:
sed 's/\(<[^>]\+>\| \)/&\n/g; s/\n \n/\n/g'
Last edited by burschik; 07-29-2008 at 01:14 PM. Reason: addition
- 07-29-2008 #5 Just Joined!
- Join Date
- Jul 2008
- Posts
- 3


