- 01-09-2011 #1
[SOLVED] nontrivial string replacement with bash on-board means
My simple bash script replaces --> with the HTML entity for the right arrow.
To be precise, it replaces --*> with &rarr;
Until now, I have used sed, for example:
Code:
$ flight='AMS --> JFK'
$ echo "$flight" | sed -e 's/ --*> / \&rarr; /g'
AMS &rarr; JFK
With sed, -* matches zero or more dashes, because for sed the * is the Kleene star, which matches zero or more instances of the preceding element. So with sed, --*> matches exactly what I want:
Code:
-> --> ---> ---->
Because sed seems oversized for this task, I tried to accomplish the same with the on-board means of bash, without sed. At first glance, this line looks as if it does the same, but it doesn't:
Code:
$ flight='AMS --> JFK'
$ echo "${flight// --*> / &rarr; }"
AMS &rarr; JFK
As I recently learnt in this forum, this * isn't the Kleene star. It is a multi-character wildcard that matches any run of zero or more characters. So here, --*> matches two dashes, then anything, then >:
Code:
--> ---> ----> --<> -->> --abc>
And if you continue your flight to SEA, the result is totally wrong because the * matches greedily:
Code:
$ flight='AMS --> JFK --> SEA'
$ echo "${flight// --*> / &rarr; }"
AMS &rarr; SEA
The sed line above would match only the dashes and produce the correct result.
Any bright idea how to accomplish this in a lightweight manner, preferably with the on-board means of bash, is very much welcome.
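For anyone who wants to see the two meanings of * side by side, here is a small sketch that tests a few sample strings against --*> read once as a bash glob and once as a regex. The sample strings and the function name check are made up for illustration; the patterns are kept in variables because a bare > inside [[ ]] would be parsed as an operator.

```shell
#!/usr/bin/env bash
# Sketch only: the same text "--*>" interpreted as a glob and as a regex.
glob='--*>'      # glob: two dashes, then any characters, then >
regex='^--*>$'   # regex: one dash, zero or more further dashes, then >

# Hypothetical helper: report whether a string matches each interpretation.
check() {
  local s=$1 g=no r=no
  [[ $s == $glob ]] && g=yes     # unquoted right-hand side is a glob pattern
  [[ $s =~ $regex ]] && r=yes    # unquoted right-hand side is an ERE
  printf '%s glob=%s regex=%s\n' "$s" "$g" "$r"
}

for s in '->' '-->' '--->' '--<>' '--abc>'; do
  check "$s"
done
```

Only --> and ---> are accepted by both readings. The glob rejects -> (it needs two dashes) but accepts --<> and --abc>, which is why it swallows everything up to the last > in the flight example.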
- 01-10-2011 #2
Bash supports a number of additional pattern operators if you enable the extglob option. One of these is the Kleene star [*(...)]. See the "Pattern Matching" section of the bash man page.
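A minimal sketch of what that could look like for the flight example, assuming extglob is acceptable in the script. The function name arrows_to_entity and the variable names are hypothetical. The replacement is kept in a quoted variable so that the & of the entity is taken literally; newer bash versions (5.2 and later, with the patsub_replacement option) can otherwise treat an unquoted & in the replacement as "the matched text".

```shell
#!/usr/bin/env bash
# Enable the extended pattern operators such as *(...) and +(...).
shopt -s extglob

# Hypothetical helper: replace " ->", " -->", " --->", ... (surrounded by
# spaces, as in the original sed command) with the HTML entity.
arrows_to_entity() {
  local text=$1
  local replacement=' &rarr; '
  # " -*(-)> " is: space, one dash, zero or more further dashes, >, space.
  # Here *(-) plays the role of the Kleene star applied to "-".
  text=${text// -*(-)> /"$replacement"}
  printf '%s\n' "$text"
}

arrows_to_entity 'AMS --> JFK --> SEA'
```

With this pattern the second arrow is no longer swallowed, so the call above prints AMS &rarr; JFK &rarr; SEA.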
- 01-10-2011 #3
Be careful here: HTML comments are terminated (usually) with "-->". I don't know if you're ever going to have HTML comments (or, in fact, any HTML tags at all) in your input... But if you are, then this requires more advanced parsing.
Instead of messing around with Bash globbing options, I would just use Bash's regex support. Bash's built-in regexes only match, they don't replace, so you'd need to run a loop to build a new string:
Code:
#!/bin/bash
INPUT="$1"
RESULT=""
# Each pass peels off the rightmost arrow; the loop ends when none is left.
while [[ "$INPUT" =~ (.*[^-])-+\>(.*) ]]; do
    # Prepend the entity and the text that followed the arrow.
    RESULT="&rarr;${BASH_REMATCH[2]}${RESULT}"
    # Keep processing the part to the left of the arrow.
    INPUT=${BASH_REMATCH[1]}
done
RESULT="${INPUT}${RESULT}"
echo "$RESULT"
It builds the result string starting from the right because the first part of the pattern, ".*[^-]" (any string ending with a character other than a dash), is greedy. So when matching "a->b->c" with that regex, ${BASH_REMATCH[1]} will be "a->b".
Of course, it'd be easier to just use sed. In practice I think I'd just do that.
- 01-11-2011 #4
Works like a charm - thanks a lot.
It would not work if a string begins with -->, because the regex needs at least one non-dash character before the dashes.
Luckily this never happens in my case.
HTML-comments ending with --> also never occur in my scenario.
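In case someone does need to handle a string that begins with an arrow, here is one possible variant of the loop that works from left to right instead. It is only a sketch, not the code from post #3: the function name replace_arrows and the variable names are made up, and like the loop above it replaces every run of dashes followed by >, with or without surrounding spaces.

```shell
#!/usr/bin/env bash
# Hypothetical left-to-right variant of the regex loop.
replace_arrows() {
  local input=$1 result='' arrow
  local entity='&rarr;'
  local re='-+>'   # one or more dashes followed by >
  while [[ $input =~ $re ]]; do
    # BASH_REMATCH[0] holds the leftmost arrow that was found.
    arrow=${BASH_REMATCH[0]}
    # Copy the text before that arrow, then the entity.
    result+=${input%%"$arrow"*}$entity
    # Continue with the text after that arrow.
    input=${input#*"$arrow"}
  done
  # Whatever is left contains no arrow.
  printf '%s\n' "$result$input"
}

replace_arrows '--> JFK -> SEA'
```

Because nothing has to precede the dashes, the call above prints &rarr; JFK &rarr; SEA.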



