Why does this union-find count too many groups after merges?

Hey everyone, I’m working on a little connectivity checker for a grid game, and I’m trying to count how many separate groups are left after merging neighbors. The annoying part is it mostly works, but on a few inputs the group count stays too high, so I feel like I’m saving a line of code and paying for it with bad state.

class DSU {
  constructor(n) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = Array(n).fill(0);
    this.groups = n;
  }

  find(x) {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(a, b) {
    const ra = this.find(a);
    const rb = this.find(b);

    if (this.rank[ra] < this.rank[rb]) {
      this.parent[ra] = rb;
    } else if (this.rank[ra] > this.rank[rb]) {
      this.parent[rb] = ra;
    } else {
      this.parent[rb] = ra;
      this.rank[ra]++;
    }

    this.groups--;
  }
}

const dsu = new DSU(5);
dsu.union(0, 1);
dsu.union(1, 2);
dsu.union(0, 2);
console.log(dsu.groups); // I expected 3

What am I missing in this union logic that makes the group count drop even when the two items are already connected?

WaffleFries

Your union is currently doing groups-- on every call, not only on an actual merge.

Once find(a) and find(b) return the same root, those nodes are already connected, so there is no new union to count. The conceptual mistake is treating “asked to union” as the same thing as “successfully merged two components.” In a DSU, the group count should only change when one root gets attached to a different root.

The smallest fix is this:

union(a, b) {
  const ra = this.find(a);
  const rb = this.find(b);

  if (ra === rb) return;

  if (this.rank[ra] < this.rank[rb]) {
    this.parent[ra] = rb;
  } else if (this.rank[ra] > this.rank[rb]) {
    this.parent[rb] = ra;
  } else {
    this.parent[rb] = ra;
    this.rank[ra]++;
  }

  this.groups--;
}

Tiny version of the same idea:

if (ra === rb) return;

That if (ra === rb) return; guard is what protects this.groups--.

  • ra === rb detects that both items already belong to the same component.
  • Leaving this.groups-- below that check ensures the count only drops after a real root-to-root merge.

That keeps repeated neighbor merges from corrupting your state.

You can run your exact case like this:

class DSU {
  constructor(n) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = Array(n).fill(0);
    this.groups = n;
  }

  find(x) {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(a, b) {
    const ra = this.find(a);
    const rb = this.find(b);

    if (ra === rb) return;

    if (this.rank[ra] < this.rank[rb]) {
      this.parent[ra] = rb;
    } else if (this.rank[ra] > this.rank[rb]) {
      this.parent[rb] = ra;
    } else {
      this.parent[rb] = ra;
      this.rank[ra]++;
    }

    this.groups--;
  }
}

const dsu = new DSU(5);
dsu.union(0, 1);
dsu.union(1, 2);
dsu.union(0, 2);

console.log(dsu.groups);     // 3
console.log(dsu.find(0));    // 0
console.log(dsu.find(1));    // 0
console.log(dsu.find(2));    // 0

If you’re scanning a grid, this guard matters a lot because the same connection often gets visited more than once.

Hari

@HariSeldon’s repeated neighbor merge point is the one that usually bites grid code, especially if you scan both right and left or both up and down, because then the same edge gets unioned twice and groups-- goes wrong fast.

Ellen

Ellen’s both-right-and-left example is the giveaway, and one practical fix is to only union in one scan direction like right and down so duplicate edges never reach DSU in the first place.

Sora