What strings does this regex match? (a|b)*

regexpal is showing infinite matches and shows “baaaaaaa” as a match. Doesn’t (a|b) mean a or b?

* means match 0 or more times. So it should be matching aa, aaa, bb, bbb, etc.?
I’m not getting how baaaa is a match. Can anyone clarify, please?

The parentheses group a subexpression, so it’ll match any combo of a and b. It might help to use longer examples, like "catdogdogcatcatcatcat".match(/(cat|dog)*/).
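Here's a quick sketch you could paste into a JS console to see which strings the pattern accepts. The ^ and $ anchors and the sample list are my additions, not part of the original pattern; I added the anchors so that the whole string has to be matched rather than just a piece of it.

```javascript
// Anchored with ^ and $ (an assumption for this demo) so the entire
// string must be consumed by (a|b)*, not just a prefix of it.
const pattern = /^(a|b)*$/;

// Sample inputs chosen for illustration only.
const samples = ["", "a", "bbbb", "baaaaaaa", "abab", "abc"];

// Pair each sample with whether the pattern accepts it.
const results = samples.map((s) => [s, pattern.test(s)]);

for (const [s, ok] of results) {
  console.log(JSON.stringify(s), ok);
}

// The unanchored cat/dog version consumes the whole mixed string too.
const catDog = "catdogdogcatcatcatcat".match(/(cat|dog)*/);
console.log(catDog[0]);
```

Everything made up only of a's and b's in any order passes, including the empty string; "abc" fails only because of the c.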

Regular expressions as implemented in most programming languages are a bit more complicated than the regular expressions of formal language theory… so there are a bunch more operators and such that can muddy someone’s understanding of what exactly constitutes a match.

I feel like I’m not getting it. I feel I’m missing something that you have a good grasp of. You choose a or b and repeat, so aaaa and bbbb should be the only matches. Could you help me out?

You repeatedly choose between a and b. That’s different than choosing once and repeatedly judging the choice against the input.

You might think of it as a precedence problem in the syntax. In your reading, where only aaaa and bbbb match, the choice happens first (a or b) and the repetition happens second (zero or more times). But it’s the opposite: the repetition happens first, and the choice happens on each repetition.
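To make the two readings concrete, here's a rough side-by-side. The second pattern, (a*|b*), is my guess at how you'd write the "choose once, then repeat" reading as a regex; the variable names and the anchors are just for this demo.

```javascript
// Repetition outside the choice: pick a or b again on every repetition.
const repeatThenChoose = /^(a|b)*$/;

// Choice outside the repetition: pick a letter once, then repeat only it.
// (Assumed spelling of the "aaaa or bbbb only" reading.)
const chooseThenRepeat = /^(a*|b*)$/;

const inputs = ["aaaa", "bbbb", "baaaa", "abab"];

// One row per input: [input, result for (a|b)*, result for (a*|b*)].
const comparison = inputs.map((s) => [
  s,
  repeatThenChoose.test(s),
  chooseThenRepeat.test(s),
]);

for (const row of comparison) {
  console.log(row.join("\t"));
}
```

Both patterns accept aaaa and bbbb, but only (a|b)* accepts the mixed strings baaaa and abab.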

I’m still not getting it. I guess there’s an FSM for it. Where can I read about those?

Thank you. The perfect way to think about this for me is this:

(X)*= X X X X X X

Here X=a or b

So,

a or b then a or b then a or b then…
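That "a or b, then a or b, then…" picture can be written out as a plain loop, which is roughly what a one-state FSM for this pattern would do. This is only a sketch: matchesAOrBStar is a made-up helper name, and it checks the whole string rather than searching for a match inside it.

```javascript
// Hypothetical helper: checks a whole string against (a|b)* without a regex.
function matchesAOrBStar(input) {
  for (const ch of input) {
    // Each repetition makes a fresh choice: this character may be a or b.
    if (ch !== "a" && ch !== "b") {
      return false;
    }
  }
  // Zero repetitions are allowed, so the empty string matches as well.
  return true;
}

console.log(matchesAOrBStar("baaaaaaa")); // true
console.log(matchesAOrBStar("abc"));      // false
```

The loop never remembers which letter it saw earlier, which is why baaaa is fine: every position is its own independent a-or-b choice.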