Single-thread mergesort

Let's see how all this translates into code, starting by learning how to code our own homemade mergesort:

# ms/algo/mergesort.py
def sort(v):
if len(v) <= 1:
return v
mid = len(v) // 2
v1, v2 = sort(v[:mid]), sort(v[mid:])
return merge(v1, v2)

def merge(v1, v2):
v = []
h = k = 0
len_v1, len_v2 = len(v1), len(v2)
while h < len_v1 or k < len_v2:
if k == len_v2 or (h < len_v1 and v1[h] < v2[k]):
v.append(v1[h])
h += 1
else:
v.append(v2[k])
k += 1
return v

Let's start from the sort function. First we encounter the base of the recursion, which says that if the list has 0 or 1 elements, we don't need to sort it, we can simply return it as it is. If that is not the case, then we calculate the midpoint (mid), and recursively call sort on v[:mid] and v[mid:]. I hope you are by now very familiar with the slicing syntax, but just in case you need a refresher, the first one is all elements in v up to the mid index (excluded), and the second one is all elements from mid to the end. The results of sorting them are assigned respectively to v1 and v2. Finally, we call merge, passing v1 and v2.

The logic of merge uses two pointers, h and k, to keep track of which elements in v1 and v2 we have already compared. If we find that the minimum is in v1, we append it to v, and increase h. On the other hand, if the minimum is in v2, we append it to v but increase k this time. The procedure is running in a while loop whose condition, combined with the inner if, makes sure we don't get errors due to indexes out of bounds. It's a pretty standard algorithm that you can find in many different variations on the web.

In order to make sure this code is solid, I have written a test suite that resides in the ch10/ms folder. I encourage you to check it out.

Now that we have the building blocks, let's see how we modify this to make it so that it works with an arbitrary number of parts.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset