- 04-05-2007 #1 Just Joined!
- Join Date
- Apr 2007
- Posts
- 2
Problems with sed matches
Hi all - I have been working on a sed script to pull some information out of an html file.
The original file is like this:
...
<span class="text1">50mL $210<br>100mL $340<br>250mL $690<br>500mL $1190<BR></span>
...
I am trying to *just* get the prices so I do the following command:
sed -n 's/.*100mL \(.*\)<br>.*/\1/p;T;q;' <filename>
and I thought it would give me: $340
but instead I get:
$340<br>250mL $690
How can I get it to stop on the first match of <br>?
I've looked all over and haven't found a way to do this - any help would be great!
Thanks.
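A note on why the capture runs long: `.*` is greedy, so `\(.*\)<br>` extends to the last lowercase `<br>` on the line that still lets the rest of the pattern match. One possible way to stop at the first tag, sketched here on the sample line, is to capture `[^<]*` instead of `.*`. The variable names `line` and `price` are illustrative only and do not come from the thread.

```shell
# Sample line from the post, held in a variable for illustration.
line='<span class="text1">50mL $210<br>100mL $340<br>250mL $690<br>500mL $1190<BR></span>'

# [^<]* cannot cross a '<', so the capture ends at the first <br> after 100mL.
price=$(printf '%s\n' "$line" | sed -n 's/.*100mL \([^<]*\)<br>.*/\1/p')
printf '%s\n' "$price"
```

The same idea works for any delimiter: exclude the delimiter's first character from the captured class rather than relying on `.*` to stop early.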
- 04-06-2007 #2
It might be worthwhile to simplify it first... Try each of these commands to see what I mean:
Code:
sed 's/<br>/ /gi' <filename>
sed 's/<br>/ /gi' <filename> | sed 's.</span>..'
sed 's/<br>/ /gi' <filename> | sed 's.</span>..' | sed 's/<.*>//'
- 04-06-2007 #3
Try this:
Code:
sed -n "s/.*100mL \(\\$\)\([0-9]\+\).*/\1\2/p" <filename>
Regards
- 04-06-2007 #4 Just Joined!
- Join Date
- Apr 2007
- Posts
- 18
This gets you all the prices line by line:
Code:
sed -e 's/[><]/\n/g' <filename> | sed -e '/\$/!d' | cut -f2 -d' '
Without 'cut' you get the pairs like:
50mL $210
100mL $340
250mL $690
500mL $1190
With sed, as far as I can tell, the 'g' option makes the substitution carry on to the next occurrence of the regexp on the line; without 'g' it stops after the first one.
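To make the effect of 'g' concrete, here is a small sketch on a shortened version of the sample line (the variable names are made up for illustration). Note that 'g' only controls how many matches on a line get replaced; it does not change how far a single `.*` stretches, so by itself it does not affect the greedy match in the original question.

```shell
# Shortened sample data, for illustration only.
line='50mL $210<br>100mL $340<br>250mL $690'

# Without g: only the first <br> on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/<br>/ /')

# With g: every <br> on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/<br>/ /g')

printf '%s\n%s\n' "$first_only" "$all"
```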
Cheers
- 04-06-2007 #5 Just Joined!
- Join Date
- Apr 2007
- Posts
- 2
Thanks Rtoip and CodeRoot!
I ended up using
sed -e 's/[<>]/\n/g' 2001 | sed -e '/mL/!d'
and it returned
50mL $210
100mL $340
250mL $690
500mL $1190
There was some other cruft in the file with $ that kept getting into the result, but the mL seemed to be unique. Now I can use awk to sort through the results.
Birdman - if I try the line you gave, I get something that looks right, but if I use 50mL, it returns the price for 250mL - I'm guessing because that is the last instance of "50mL" on that line...
Thanks everyone.
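The 50mL guess is right: the leading `.*` is greedy, so it swallows as much of the line as it can and the pattern lands on the last place "50mL" appears, which is inside "250mL". One way around it, sketched below under the assumption that each size directly follows a `>` as in the sample line, is to include that `>` in the pattern. The variable names are illustrative only.

```shell
# Sample line from the first post, held in a variable for illustration.
line='<span class="text1">50mL $210<br>100mL $340<br>250mL $690<br>500mL $1190<BR></span>'

# Requiring '>' right before 50mL rules out the "50mL" inside "250mL".
price_50=$(printf '%s\n' "$line" | sed -n 's/.*>50mL \(\$[0-9]*\).*/\1/p')
printf '%s\n' "$price_50"
```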
- 04-06-2007 #6
[/E*Fare/Users/qabuild/aksutil/P/pawan] $ echo '<span class="text1">50mL $210<br>100mL $340<br>250mL $690<br>500mL $1190<BR></span>
'| grep -o "[0-9]*mL \$[0-9]*"
50mL $210
100mL $340
250mL $690
500mL $1190
Just by using GREP!!!
/A
Originally Posted by penguinpower80
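If only the prices are wanted, without the sizes, the same `grep -o` approach can be narrowed to the dollar amounts. This is a rough sketch assuming a grep that supports `-o` (GNU grep does) and that no other dollar amounts appear in the input; the variable names are illustrative.

```shell
# Sample line from the first post, held in a variable for illustration.
line='<span class="text1">50mL $210<br>100mL $340<br>250mL $690<br>500mL $1190<BR></span>'

# -o prints each match on its own line; \$ is a literal dollar sign.
prices=$(printf '%s\n' "$line" | grep -o '\$[0-9][0-9]*')
printf '%s\n' "$prices"
```

On a real page with other `$` values, matching the size and price together and then cutting out the second field would be the safer variant.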
- 04-06-2007 #7 Just Joined!
- Join Date
- Apr 2007
- Posts
- 18
Sangal, fine. As is nearly always the case, there are many utilities that will lead to the same result.

