  1. #1
    Just Joined!
    Join Date
    Dec 2010
    Posts
    16

    [SOLVED] nontrivial string replacement with bash on-board means

    My simple bash-script replaces --> by the HTML-entity for the right arrow.
    To be precise, it replaces --*> by &rarr;
    Until now, I used sed, for example:

    Code:
    $ flight='AMS --> JFK'
    $ echo "$flight" | sed -e 's/ --*> / \&rarr; /g'
    AMS &rarr; JFK
    With sed, -* matches zero or more dashes, because for sed the * is the Kleene Star matching zero or more instances of the previous element. So with sed, --*> matches exactly what I want:

    Code:
    ->
    -->
    --->
    ---->
    Because sed seems like overkill for this task, I tried to accomplish the same with bash built-ins alone, without sed. At first glance, this line looks as if it does the same, but it doesn't:

    Code:
    $ flight='AMS --> JFK'
    $ echo "${flight// --*> / &rarr; }"
    AMS &rarr; JFK
    As I recently learnt in this forum, this * isn't the Kleene Star. It is a multi-character wildcard matching any zero or more consecutive characters. So here, --*> matches two literal dashes, then any characters at all, then >:

    Code:
    -->
    --->
    ---->
    --<>
    -->>
    --abc>
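
A quick way to check which strings the glob --*> matches as a whole is a [[ string == pattern ]] test. The sketch below is only illustrative: matches_glob is a hypothetical helper name, not something from the thread. Note that the glob needs both leading dashes to be present literally, so a single-dash arrow such as -> is not matched, while --abc> is.

```bash
#!/bin/bash

# Hypothetical helper: succeed if the whole string matches the glob --*>
matches_glob() {
    # Keep the pattern in a variable: a bare > inside [[ ]] would be
    # parsed as the string comparison operator.
    local pattern='--*>'
    [[ $1 == $pattern ]]
}

# Print a verdict for a few candidate strings.
for candidate in '->' '-->' '--->' '--abc>' '-abc>'; do
    if matches_glob "$candidate"; then
        printf '%s matches\n' "$candidate"
    else
        printf '%s does not match\n' "$candidate"
    fi
done
```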
    And if you continue your flight to SEA, then the result is totally wrong because the * matches greedily:
    Code:
    $ flight='AMS --> JFK --> SEA'
    $ echo "${flight// --*> / &rarr; }"
    AMS &rarr; SEA
    The sed line above would produce the correct result and would match the dashes correctly.
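
For comparison, here is the sed command from above applied to the two-leg flight. The wrapper function sed_arrows is a hypothetical name used only for this sketch. The & is escaped because an unescaped & in a sed replacement stands for the matched text.

```bash
#!/bin/bash

# Hypothetical wrapper around the sed command from the post.
sed_arrows() {
    # " --*> " is a space, one dash, zero or more further dashes, ">", a space.
    printf '%s\n' "$1" | sed -e 's/ --*> / \&rarr; /g'
}

sed_arrows 'AMS --> JFK --> SEA'
```

This should print AMS &rarr; JFK &rarr; SEA, with each arrow replaced separately.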
    Any bright ideas on how to accomplish this in a lightweight manner, preferably with bash built-ins, are very welcome.

  2. #2
    Trusted Penguin Cabhan
    Join Date
    Jan 2005
    Location
    Seattle, WA, USA
    Posts
    3,230
    Bash supports a number of additional pattern operators if you enable the extglob option. One of these is the Kleene star [*(...)]. See the "Pattern Matching" section of the bash man page.
    DISTRO=Arch
    Registered Linux User #388732
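
A minimal sketch of the extglob suggestion above, assuming the goal is still to replace a space, one or more dashes, > and a space. The function name replace_arrows and its variable names are hypothetical. The pattern -*(-)> reads as one literal dash, then *(-) for zero or more further dashes, then >. The replacement is quoted because newer bash versions can treat an unquoted & in the replacement as the matched text.

```bash
#!/bin/bash
shopt -s extglob   # enable extended patterns such as *(...)

# Hypothetical helper: replace " ->", " -->", " --->" ... (with surrounding
# spaces) by the HTML entity, using only bash parameter expansion.
replace_arrows() {
    local text=$1
    local pattern=' -*(-)> '   # space, dash, zero or more dashes, >, space
    local arrow=' &rarr; '
    local result
    # Quoting "$arrow" keeps the & literal.
    result=${text//$pattern/"$arrow"}
    printf '%s\n' "$result"
}

replace_arrows 'AMS --> JFK --> SEA'
```

Because *(-) can only consume dashes, the match cannot run on to a later arrow, so each arrow is replaced on its own.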

  3. #3
    Linux Newbie tetsujin
    Join Date
    Oct 2008
    Posts
    115
    Quote (originally posted by Lahntaler):
    My simple bash-script replaces --> by the HTML-entity for the right arrow.
    Be careful here: HTML comments are terminated (usually) with "-->". I don't know if you're ever going to have HTML comments (or, in fact, any HTML tags at all) in your input... But if you are, then this requires more advanced parsing.

    Instead of messing around with Bash globbing options, I would just use Bash's regex support. Bash's built-in regexes only match, they don't replace, so you'd need to run a loop in order to build a new string:

    Code:
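    #!/bin/bash
    
    INPUT="$1"
    RESULT=""
    # Each pass handles the rightmost arrow, because the first group is greedy.
    while [[ "$INPUT" =~ (.*[^-])-+\>(.*) ]]; do
            # Prepend the entity plus the text that followed this arrow.
            RESULT="&rarr;${BASH_REMATCH[2]}${RESULT}"
            # Continue with the part to the left of the arrow.
            INPUT=${BASH_REMATCH[1]}
    done
    RESULT="${INPUT}${RESULT}"
    # Quote the variable so that runs of whitespace are preserved.
    echo "$RESULT"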
    It builds the result string starting from the right because the first group, (.*[^-]), which matches any string ending with a character other than a dash, is greedy. So when matching "a->b->c" against that regex, ${BASH_REMATCH[1]} will be "a->b" and ${BASH_REMATCH[2]} will be "c".

    Of course, it'd be easier to just use sed. In practice I think I'd just do that.

  4. #4
    Just Joined!
    Join Date
    Dec 2010
    Posts
    16
    Works like a charm - thanks a lot.

    It would not work if a string begins with -->, because the pattern requires at least one non-dash character before the dashes.
    Luckily this never happens in my case.
    HTML-comments ending with --> also never occur in my scenario.
