Bloom Filters - This might be my longest tutorial!

My latest article is on Bloom Filters:

This is one of those things that started off with the goal of keeping things simple but gradually got more complex as I decided to explain more and MORE things. I think the final length of this is around 25 pages (with images), which by itself is around 5% the total length of my book on Algorithms! :scream:

1 Like

Do you have data on how many people read most articles based on the amount of words? You might need to consider making YouTube video tutorials for large articles?

I think if you are skimming for an answer, the longer tutorials aren’t going to help. If you are trying to learn something in-depth, then the longer tutorials are just what you are looking for. I do tend to optimize for the second group.

As for the videos, that is usually the plan. I do have videos for almost all of my longer articles, but they do lag by a few weeks depending on how busy I am :slight_smile:

1 Like

Great article. I had one question, though, and it probably reflects my lack of understanding. You say, " 1. Too few hash functions: The bit array will be under-utilized, leading to lower collision rates but higher false negative rates as fewer bits are set." I thought that bloom filters had no false negatives. Is it that too few hash functions also increase false positives, for the reason you state (under-utilization of bit array…too many collisions)?

1 Like

Hi @mflanagan - thanks for catching an error in what I wrote. You are not lacking understanding :slight_smile:

The correct statement should be that too few hash functions will indeed lead to a higher false positive rate.

I rewrote the explanation to be more clear and accurate.

1 Like