126. Union Find

The Union Find algorithm operates on a disjoint-set data structure.

A disjoint-set data structure defines sets of elements separated in certain disjoint subsets that are not overlapping. Graphically, we can represent a disjoint-set with three subsets, as in the following diagram:

In the code, a disjoint-set is represented as follows:

  • n is the total number of elements (for example, in the preceding diagram, n is 11).
  • rank is an array initialized with 0 that is useful to decide how to union two subsets with multiple elements (subsets with lower rank become children of subsets with a higher rank).
  • parent is the array that allows us to build an array-based Union Find (initially, parent[0] = 0; parent[1] = 1; ... parent[10] = 10;):
public DisjointSet(int n) {

this.n = n;
rank = new int[n];
parent = new int[n];

initializeDisjointSet();
}

Mainly, the Union Find algorithms should be capable of the following:

  • Merging two subsets into a single subset
  • Returning its subset for the given element (this is useful for finding elements that are in the same subset)

In order to store a disjoint-set data structure in memory, we can represent it as an array. Initially, at each index of the array, we store that index (x[i] = i). Each index can be mapped to a piece of meaningful information for us, but this is not mandatory. For example, such an array can be shaped as in the following diagram (initially, we have 11 subsets and each element is a parent of itself):

Or, if we use numbers, we can represent it in the following diagram:

In terms of code lines, we have the following:

private void initializeDisjointSet() {

for (int i = 0; i < n; i++) {
parent[i] = i;
}
}

Furthermore, we need to define our subsets via the union operation. We can define subsets via a sequence of (parent, child) pairs. For example, let's define the following three pairs—union(0,1);union(4, 9);, and union(6, 5);. Each time an element (subset) becomes a child of another element (subset), it will modify its value to reflect the value of its parent, as in the following diagram:

This process continues until we define all our subsets. For example, we can add more unions—union(0, 7);, union(4, 3);union(4, 2);, union(6, 10);, and union(4, 5);. This will result in the following graphical representation:

As a rule of thumb, it is advisable to union smaller subsets to larger subsets and not vice versa. For example, check the moment when we unify the subset that contains 4 with the subset that contains 5. At that moment, 4 is the parent of the subset and it has three children (2, 3, and 9), while 5 is next to 10, the two children of 6. So, the subset that contains 5 has three nodes (6, 5, 10), while the subset that contains 4 has four nodes (4, 2, 3, 9). So, 4 becomes the parent of 6 and, implicitly, the parent of 5.

In code lines, this is the job of the rank[] array:

Let's now take a look at how to implement the find and union operation. 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset