  1. #1
    Just Joined!
    Join Date
    Nov 2004
    Location
    France -> Angers
    Posts
    4

    Regular expression to see if text has required elements in any order


    Hi all, I've got a text that needs to contain each of the following patterns (%1, %2, %3) exactly once. They can appear in any order, and any characters can come between them.
    e.g.:
    Hello %1, you are a %2 and live in %3 -> is correct
    There is 1234 %2 in %3 and only one named %1 -> is correct
    %3 is a great town, is there a lot of %2 there? -> is not correct (missing %1)
    Hello %1, my name is also %1, I live in %3 but I'm not a %2 -> is not correct (2x %1)

    I want to avoid listing every possible ordering like this:
    .*(%1.*%2.*%3|%3.*%2.*%1|%2.*%1.*%3|....).*

    because my next regexp has 6 elements to check.

    Thanks

  2. #2
    Linux Enthusiast
    Join Date
    Jan 2005
    Posts
    575
    Hmm, that's a tricky one. I very much doubt it can be done using regular
    expressions alone. You would need to combine regular expressions with some
    other programming construct.

    The following is not correct:
    .*(%1.*%2.*%3|%3.*%2.*%1|%2.*%1.*%3|....).*
    It also matches strings in which a pattern occurs more than once.

    By the way, are the patterns allowed to overlap? From the way you're phrasing
    things I would guess not, but can you clarify?
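One way to read "some other programming construct combined with regular expressions" is to let a regex find the placeholders and let ordinary code count them. The Python sketch below is only an illustration under assumptions: the function name `has_each_once` is made up for this example, placeholders are taken to be `%` followed by digits, and any other `%N` in the text is simply ignored.

```python
import re
from collections import Counter


def has_each_once(text, count=3):
    """Return True if each of %1..%count occurs exactly once in text.

    Hypothetical helper, not from the thread: the regex only extracts
    the placeholder numbers; the "exactly once" rule is checked in code.
    """
    seen = Counter(re.findall(r"%(\d+)", text))
    return all(seen[str(n)] == 1 for n in range(1, count + 1))


if __name__ == "__main__":
    samples = [
        "Hello %1, you are a %2 and live in %3",
        "There is 1234 %2 in %3 and only one named %1",
        "%3 is a great town, is there a lot of %2 there?",
        "Hello %1, my name is also %1, I live in %3 but I'm not a %2",
    ]
    for sample in samples:
        print(has_each_once(sample), "-", sample)
```

Moving from 3 to 6 elements only changes the `count` argument, so no list of orderings is needed.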

  3. #3
    Just Joined!
    Join Date
    Nov 2004
    Location
    France -> Angers
    Posts
    4
    You're right, this isn't even correct. I'm not sure I understand what you mean by the patterns being allowed to overlap?

  4. #4
    Linux Enthusiast
    Join Date
    Jan 2005
    Posts
    575
    I didn't express it well. The right question is whether the
    portions of the string that match each pattern are allowed to overlap.
    For example, assume that pattern 1 is ab, pattern 2 is bc
    and pattern 3 is cd. The string abcd matches all 3 patterns
    exactly once, but some parts of the string have to be used
    more than once to do so: the letter b, for instance, is needed
    to match both pattern 1 and pattern 2. Is this OK, or do you only
    want strings like abbccd, where a different part of the string
    is used to match each pattern?
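The overlap question can be made concrete with a short Python sketch that records where each pattern matches and checks whether any two matched regions share a character. The helper names `match_spans` and `spans_overlap` are invented for this illustration, and only the first match of each pattern is considered.

```python
import re


def match_spans(text, patterns):
    """Return the (start, end) span of the first match of each pattern,
    or None if some pattern does not match at all."""
    spans = []
    for pattern in patterns:
        found = re.search(pattern, text)
        if found is None:
            return None
        spans.append(found.span())
    return spans


def spans_overlap(spans):
    """Return True if any two spans share at least one character."""
    ordered = sorted(spans)
    # After sorting by start, an overlap always shows up between neighbours.
    return any(
        prev_end > next_start
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    )


if __name__ == "__main__":
    patterns = ["ab", "bc", "cd"]
    for text in ("abcd", "abbccd"):
        spans = match_spans(text, patterns)
        print(text, spans, "overlap" if spans_overlap(spans) else "no overlap")
```

For the literal placeholders %1, %2 and %3 the question does not arise: each is two characters long and starts with %, so two of them can never share a character.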

  5. #5
    Linux User
    Join Date
    Jul 2004
    Location
    Poland
    Posts
    368
    This is really tricky. The problem is that you have to remember what has already been matched and not match it again. I tried embedding \<digit> in bracket expressions like [^\1], but that did not seem to work. After thinking about it a little, I decided to use negative look-ahead patterns to check, each time we hit %<digit>, that this number does not occur again later. Here's a snippet:
    Code:
    ^[^%]*%([1-3])(?!.*%\1)[^%]*%([1-3])(?!.*%(\1|\2))[^%]*%([1-3])[^%]*$
    It's a little unwieldy, but it passes the cases you provided.
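To try the look-ahead idea against the four example sentences, here is a small Python sketch. `POSTED` is the pattern from this post. `build_pattern` is a hypothetical variant, not the poster's method: it uses one positive and one negative look-ahead per placeholder so that the pattern grows linearly with the number of elements. It assumes single-line text and at most 9 placeholders (otherwise %1 would also match inside %10). Unlike the posted pattern, it does not reject other % characters in the text.

```python
import re

# The look-ahead pattern from the post, split over three lines for readability.
POSTED = re.compile(
    r"^[^%]*%([1-3])(?!.*%\1)"
    r"[^%]*%([1-3])(?!.*%(\1|\2))"
    r"[^%]*%([1-3])[^%]*$"
)


def build_pattern(count):
    """Hypothetical helper: require each of %1..%count exactly once.

    For every placeholder, (?=.*%n) demands at least one occurrence and
    (?!.*%n.*%n) forbids a second one. Assumes count <= 9.
    """
    parts = []
    for n in range(1, count + 1):
        token = "%" + str(n)
        parts.append(f"(?=.*{token})(?!.*{token}.*{token})")
    return re.compile("^" + "".join(parts))


# (text, should it be accepted?) pairs taken from the first post.
CASES = [
    ("Hello %1, you are a %2 and live in %3", True),
    ("There is 1234 %2 in %3 and only one named %1", True),
    ("%3 is a great town, is there a lot of %2 there?", False),
    ("Hello %1, my name is also %1, I live in %3 but I'm not a %2", False),
]

if __name__ == "__main__":
    general = build_pattern(3)
    for text, expected in CASES:
        posted_ok = POSTED.search(text) is not None
        general_ok = general.search(text) is not None
        print(expected, posted_ok, general_ok, "-", text)
```

For the 6-element case mentioned in the first post, `build_pattern(6)` produces the corresponding pattern without listing any orderings.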
