Skip to content

Latest commit

 

History

History
89 lines (65 loc) · 2.45 KB

Disjoint-Set.md

File metadata and controls

89 lines (65 loc) · 2.45 KB

Disjoint Set (Union-Find)

A disjoint-set or union-find data structure is a data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It supports the following operations:

  • MAKE-SET(x): creates a new set containing the single element x
  • FIND(x): returns the representative of the set containing x
  • UNION(x, y): unions the two sets which contains x and y repectively

Multiway Tree Implementation

void makeSet(int x)
{
    parent[x] = x;
}

int find(int x)
{
    while (parent[x] != x):
        x = parent[x];
    return x;
}

int union(int x, int y)
{
    int xroot = find(x);
    int yroot = find(y);
    if (xroot != yroot)
        parent[yroot] = xroot;
}

Optimization

Smart Union

  • Union by rank: attach the shorter tree to the higher tree (we use rank instead of height as path compression will change the tree height dynamicly).

  • Union by size: attach the smaller tree to the larger tree.

To store the rank/size info, we can use either a new array to record that or store the negative rank/size as the parent value of the representative element.

An example of union by rank (use a new array to store rank) is as follows:

int union(int x, int y)
{
    int xroot = find(x);
    int yroot = find(y);

    // x and y are already in the same set
    if (xroot == yroot)
        return;

    // x and y are not in same set, so we merge them
    // merge smaller-rank tree into larger-rank tree
    if (rank[xroot] < rank[yroot])
        swap(xroot, yroot);
    parent[yroot] = xroot;
    if (rank[xroot] == rank[yroot])
        ++rank[xroot];
}

Path Compression

Flatten the tree every time we make a find call for efficient future searches.

int find(int x)
{
    if (parent[x] != x)
        parent[x] = find(parent[x]);
    return parent[x];
}

Complexity

Without optimizations, the height of trees can grow unchecked as O(n), implying that Find and Union operations will take O(n) time.

The amortized time complexity is O(alpha(n)) for optimized Union and Find (optimized), where alpha(n) is the inverse Ackermann function. This function has a value alpha(n) < 5 for any value of n that can be written in this physical universe, so the disjoint-set operations take place in essentially constant time.

The space complxity is O(n).

Application

  • Keep track of connected components in a undirected graph
  • Find a minimum spanning tree of a graph (Kruskal's algorithm)