Graphs everywhere — Finding the minimum swaps required to sort an array

Published in

Level Up Coding

8 min readJul 9, 2020

Graphs are one of the most fundamental concepts in Computer Science. And they play the part perfectly in solving a tremendous number of problems. In this article, we’re going to bring into the light an algorithm question that at first glance one may not even think that graphs can be of any use to drive a solution for. But first, let’s have a look at our journey today. Please kindly note, we won't go deep with the Graphs concepts in this article, rather, a shallow introduction over that and then to focus on its applications to tackle arrays algorithms questions.

Road Map

A quick introduction to Graphs — Structure, Paths, and Cycles
The problem to solve — Finding the minimum required swaps to sort an array
Coding time
Always unit test your algorithms

An Introduction to Graphs

The structure of a graph

A graph is a data structure that in its most compact form is made up of a single vertex (also called a node). Here’s a simple representation of it:

We can obviously extend this graph by adding more properties to it. For example, another connected node to this where the direction between the nodes is not important, just like the below:

Figure 2 — An undirected graph (direction does not matter)

Now, if we simply add a direction to the edge between the two nodes above, we would then turn it into a directed graph which also is important for the problem we are going to solve shortly. Here’s how it looks like:

Figure 2 — A directed graph (direction matters)

It really sounds easy so far, yet, we need to cover the Paths and Cycles of a graph so we get our backpack ready to solve our algorithm question.

2. Paths

To understand the path theoretically, we need to first have a touch on the Walks within a graph. So let’s see what that is:

2.1. Walk — A Walk is basically a sequence of edges where each edge, except the first one, starts with a node (vertex) where the previous edge ended. For example, the following graph demonstrates a walk:

Here, if we traverse through A > B > C that would give us a walk — same if we travel A > B > C > D, or even, A > B > C > D > C.

The crucial note to remember here is that a Walk could have repeated edges and that would not have any interference with the definition of a Walk.

2.2. Path — A path is simply a walk where you would never go through an edge more than once. Hence, all the edges and vertices are distinct. Going back to the previous example, A > B > C would give us a path but not
A > B > C > D > C.

2.3. Cycle — A cycle can be detected when the first and the last node on a path are the same. That would give distinct edges but not distinct nodes which would lead to a Cycle. The following figure shows us a cycle of
A > B > C > D > A.

Figure 4 — Representing a cycle for A > B > C > D > A

Up to this point, we have covered all we needed to know so we can tackle the algorithm question we have. So without any further details let’s read on to find out more about the question.

The problem to solve — Finding minimum swaps to sort an array

The problem statement

We are given an array of distinct integers which are not sorted. We are allowed to swap any of the two items of the array to eventually make it sorted in ascending order. Our task is to write a piece of an algorithm that works out the minimum number of swaps that are required to sort the array as expected.

When I faced this question for the first time, I couldn’t think of anything like graphs to be turning out quite helpful so I could drive the skeleton of a solution with. But, I started to realize taht when I started to walk myself through the problem and to also visualize it over a piece of paper which is what we want to experience in the rest of the article.

2. A walkthrough over the problem

Let’s start with an array of integers as our program input. Here it is:

Figure 5 — An array of integers as the input of the program

As it’s obvious our numbers are not placed in their correct position. So, let’s see a representation of our final array — the sorted one.

Figure 6 — The expected final result — the sorted array

Now, we need to give us some time, stare into this figure, and try to showcase some creativity so that we can find a relation between the numbers and their positions.

2.1. Finding the correct (final) positions of each number

If you look into each cell of the above figure, the sorted one, you would realize that the value of each cell would equal its final cell index, if, it is subtracted by 1. For example, the first cell holds number one; if we subtract it by 1 that would give us 0 representing the correct index (or the correct position) where 1 belongs to. The same rule applies to every other cell of the array too.

Note, it is important to do the above on the expected result. Because our main intention here is two things:

1) To find out the correct (final) position of each number within the array
2) To apply that finding against the input of the program, the unsorted array, so we can then design an algorithm to move the items into their correct positions.

Please look at the following figure that visualizes what we have just achieved.

Figure 7 — Finding out the correct position (final position) of each number in the array

2.2. Considering the final positions on the unsorted (input) array

Now that we know who should be where; let’s have a look at our input array one more time and to see how we can apply our findings against that.

It’s getting interesting actually — look at the first number: 2. It must be placed at index or cell 1. Applying the graphs knowledge from the early sections, we can denote a path from index 0 to index 1; in other words, the value of index 0 which is number 2, must be placed in index 1, hence, 0 > 1 is the path. Give it some time to think over that if needed and make sure you’re confident with this up to this point.

This should give us some inspiration to draw out the paths against our array. The following figure visualizes these paths until they reach a cycle.

Figure 9 — Solution representation; looking at the array from a graph point of view and detecting cycles in that

This figure is the actual solution to the problem. Consider that first cycle:
0 > 1 > 0. This is saying that the element at index 0 must be placed at index 1 and the element at index 1 must be placed at index 0, hence, swapping the elements will make them sorted in ascending order. The total swaps needed on this cycle would be 1. This logic applies to the second cycle too,
2 > 4 > 3 > 2 for which a total swap of 3 will make it sorted. Therefore, we can conclude that f(swaps) = number(nodes) -1. If we simply sum up the swaps for each cycle, we will end up with the minimum swaps required to get the array sorted in ascending order.

Woop! Good job — it may be worth it to go through this one more time on your own before burning the code for that if you find that helpful.

Coding time — C#

We would like to implement our solution, figure 9, in a way that we can test it unit by unit. Hence, we break our solution into two different major parts:

A method responsible to generate the cycles
A loop against the input array so that for each item we would consider the cycles

We also want to keep track of our visited nodes so we avoid computing the same subproblem over and over again, hence, optimizing the performance. For more information on this, please read my article on Dynamic Programming. Let’s see how the magic looks like.

Figure 10 — The method responsible for detecting the cycles and returning the path to each of which

The FindCyclesPaths method does one single thing against which we can easily write a unit test to prove its correctness. Needless to say that, the while loop will break once we reach a cycle which is when the first node gets equal to the destination node. For example, in Figure 9, the destination of the first index is the second cell of the array. This function returns a list of nodes denoting a cycle path as per Figure 9. The computation of the swaps is not within the boundary of this method, hence, building up testable codebase.

Now, we need to settle our main method that travels through the input array calling the FindCyclesPaths for each un-visited node.

Figure 11 — The actual method for finding the minimum required swaps

As explained before, we would like to introduce a data structure in which to store the visited nodes. We use a boolean array for this with the same length as that of the input array, here named numbers. By default, C# will populate false values for each index, hence, no need to do it manually. Next, we walk through our items and we compute the swaps for each cycle path we visited. Finally, the swaps variable gets returned as the result of our procedure.

Well done! we’ve got this implemented and HackerRank loves it too as it gives full score to this solution. Now, let’s confirm the correctness of our code using unit tests.

Always unit test your algorithm

At this step, we would write a few test cases merely to prove the basic scenarios and I would like to leave the edge cases for the readers to explore more on this. Here’s our first unit test for the cycle detection function verifying the two cycles detected in Figure 9.

Figure 12 — Unit tests for FindCyclesPath method

And finally, one for our main function:

Figure 13 — Unit test for FindMinSwaptoSortArray method

And that’s all really — we have covered few concepts of the graphs theory, we went through a medium level algorithm questions, looked at an array sorting problem from graphs perspective, a quick touch on DP Programming, and eventually proved the correctness with some unit tests.

Thanks for reading this through and happy coding! 🐱‍🏍