Fuzzy string match regex

Hey guys,

I’m trying to implement a fuzzy string match regex that will:

  • match only whole words, not sentences
  • accept typos
  • accept missing chars

I found a regex that works OK: /a[^b]*b[^c]*c/gi
It matches 'abc' and 'a123 b123 c', and it will match regardless of length, word boundaries (white space) and the chars in between, but every letter of the search term must be present.
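For anyone who wants to try it, here is a quick sketch of that first pattern. I've dropped the g flag on purpose, since a global regex keeps state in lastIndex between test() calls, and the name `loose` is just mine for illustration.

```javascript
// First pattern: after each letter, allow any run of characters that are
// not the next letter, so digits and spaces in between are accepted.
const loose = /a[^b]*b[^c]*c/i;

console.log(loose.test('abc'));         // true
console.log(loose.test('a123 b123 c')); // true
console.log(loose.test('ac'));          // false: the 'b' is missing
```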

I didn’t want white space or an unlimited number of chars in between, so I modified it to /a[^ ]?b[^ ]?c/gi
It matches 'a1bc' but not 'a12bc'.

Changing * to ? allows at most one arbitrary char in between.
Changing [^b] and [^c] to [^ ] allows any char that is not a space.
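A small sketch of the tightened pattern, assuming both gaps are meant to be [^ ]? (at most one non-space char each). Again no g flag, and `tight` is only an illustrative name. Note the pattern is not anchored, so it will also match inside a longer word.

```javascript
// Tightened pattern: at most one non-space character between letters.
const tight = /a[^ ]?b[^ ]?c/i;

console.log(tight.test('abc'));   // true
console.log(tight.test('a1bc'));  // true: one extra char between a and b
console.log(tight.test('a12bc')); // false: two extra chars
console.log(tight.test('a bc'));  // false: a space is not allowed in the gap
```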

The only issue is that all chars must be present to match.

The only way I can see around this would be to use 2 regexes, each keeping every second letter of the search term, and test the word against both.

e.g. the search term 'Matching' would become the 2 regexes M_t_h_n_ and _a_c_i_g.
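Roughly what I have in mind, as a sketch only. The helpers `escapeRegex`, `halfPattern` and `fuzzyMatch` are names I made up, each _ becomes a single [^ ], and anchoring with ^ and $ is my assumption for "whole word".

```javascript
// Escape regex metacharacters so letters from the term match literally.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: keep every second letter (starting at `offset`, 0 or 1)
// and replace the others with a single non-space wildcard.
function halfPattern(term, offset) {
  const body = [...term]
    .map((ch, i) => (i % 2 === offset ? escapeRegex(ch) : '[^ ]'))
    .join('');
  // Assumption: the candidate is a single word, so anchor both ends.
  return new RegExp(`^${body}$`, 'i');
}

// A word is accepted if either half-pattern matches it.
function fuzzyMatch(term, word) {
  return halfPattern(term, 0).test(word) || halfPattern(term, 1).test(word);
}

console.log(halfPattern('Matching', 0).source);  // ^M[^ ]t[^ ]h[^ ]n[^ ]$
console.log(halfPattern('Matching', 1).source);  // ^[^ ]a[^ ]c[^ ]i[^ ]g$
console.log(fuzzyMatch('Matching', 'Matchimg')); // true: typo caught by _a_c_i_g
console.log(fuzzyMatch('Matching', 'Match'));    // false: length must be exact
```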

The issue with this is that a short search term like "air" would match any three-letter word with an i in the middle, and the word length would have to be exact.

If I went back to * instead of ?, even a four-letter search term would match any word that contains just 2 of its letters in order.
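To show both problems concretely, here is a sketch with the split patterns written out by hand. The names `airStrict`, `makeLoose` and `matchesAny` are mine, and "make" is just an arbitrary four-letter term.

```javascript
// True if the word matches at least one of the patterns.
const matchesAny = (patterns, word) => patterns.some((re) => re.test(word));

// Strict split of "air": a_r and _i_ (each _ is exactly one non-space char).
const airStrict = [/^a[^ ]r$/i, /^[^ ]i[^ ]$/i];

console.log(matchesAny(airStrict, 'big'));  // true: only the middle 'i' is checked
console.log(matchesAny(airStrict, 'airy')); // false: length must be exactly 3

// Loose split of "make" with * instead of a single char: m*k* and *a*e.
const makeLoose = [/^m[^ ]*k[^ ]*$/i, /^[^ ]*a[^ ]*e$/i];

console.log(matchesAny(makeLoose, 'monkey')); // true: only 'm' and 'k' are needed
console.log(matchesAny(makeLoose, 'table'));  // true: only 'a' and 'e' are needed
```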

I’m not really seeing a way around this, short of writing some monster regex that performs badly.

Does anybody have any out of the box ideas? :grinning:

(Slinks away and goes hiding in the back! :snake:)
