# Heapsort with Max-Heap and Heapify - Thorough Guide and Animated Walk-Through

Let's visualize the full process of converting an array to a max-heap and sorting the heap with Heapsort.Heapsort is a comparison-based sorting algorithm that relies on maintaining a max-heap to quickly find the largest value on each iteration.

Heapsort has an `O(n log n)`

runtime, and, since sorting is performed in place, space complexity is constant `O(n)`

total - `O(1)`

auxiliary.

This visualization is a bit more complex with multiple moving parts, so make sure to pause/manually step through the code highlighter as necessary.

**Note: you will likely need to be somewhat familiar with tree data structures to follow along.*

Though heapsort can seem complex at first glance, once we break down each piece, it's actually a fairly intuitive and accessible algorithm.

## Create a max-heap first

Let's start by creating a max-heap. A max-heap is a tree data structure where the left and right child nodes are always less than the parent.

We'll get the middle element of the array that we want to sort: `length // 2 - 1`

Next, we loop in the range between this middle element to the first element and run `heapify`

on every iteration.

Then, finally, on `line 25`

we return the array once it has been converted to a max-heap.

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

### Let's heapify our heap

Heapify essentially loops through the entire tree making sure each tree and subtree are valid heaps.

**This is a max-heap:**

6

/ \

4 5

/ \

2 1

**Note: min-heap is merely the reverse of a max-heap*

**This is NOT a heap:**

2

/ \

6 3

/ \

1 4

So in the snippet below, we start at the middle element in our unsorted array minus one. All items on the right side are leaves so we don't need to start on them:

#### Why are all leaves on the right?

A tree can be represented as a flat array. Let's look at an example:

**[ 1, 3, 5 ]**

i = 0

left = i * 2 + 1

right = i * 2 + 2

So, our sample array could also be represented as this tree:

1

/ \

3 5

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

## Our array data

*0*4

*1*15

*2*30

*3*25

*4*16

*5*50

*6*14

*7*3

*8*1

*9*

We start our heapify function at the middle item minus 1 (left-side are all subtree roots, right-side are all leaves).

Next, we start by initializing `largest`

to the root node (`i`

) and hold reference to the `left`

and `right`

leaf positions:

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

## array

*0*4

*1*15

*2*30

*3*25

*4*16

*5*50

*6*14

*7*3

*8*1

*9*

First iteration starts at the last possible root and we get reference to it's child leaves (in this case it only has a single child).

If the left node exists and is a bigger number than the current largest, then we go ahead and change largest to left instead of root.

Same with the right node, if the right is larger than either left or root, then we set it as the largest node.

So essentially we look at the entire subtree (containing 3 nodes; root, left, and right) and figure out which node is the largest of the 3.

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

Now if `largest`

is not the root node (`i`

) then we need to go ahead and swap the largest node up to the root in-order to have a valid max-heap within this subtree.

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

## Before heapify

*0*1

*1*8

*2*

Before we run heapify we have our root:6, and two child nodes (left:1 and right:8)

## After heapify

*0*1

*1*6

*2*

Once we swap the root and right nodes, we now have a proper heap

On `line 16`

we checked if `largest != i`

, this is because if there were no swaps (i.e., the root node (i) was already the largest node) since we travel from the end upwards, this means there is no reason to check any nodes further down, all nodes below this one are guaranteed to be correct.

However, if there WAS a change, a swap between another node means we need to continue checking down the path of the swapped node.

So we only *continue* "heapifying" if there was a swap, and notice we run `heapify`

AFTER the swap, which means the next iteration is going to use the swapped node as the new root node - we run `heapify`

on `i`

which, after swapping, is now a child node instead of the root.

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

And that wraps up the max-heap part of our algorithm - once all iterations are done we will have a perfect max-heap where all child nodes are smaller than their parents.

## Breaking down the heapsort algorithm

The hardest part of this algorithm was the heapify method discussed above, as you can see below, heapsort simply utilizes the `heapify`

method again along with one more swap:

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

We iterate through our heap, starting at the end and traveling backward.

We start by swapping the first element with the `i`

'th element. Remember, since we are using a max-heap this first element is the guaranteed highest number in our heap, so we swap it with the last number in our heap.

## After swap

*0*30

*1*16

*2*14

*3*25

*4*6

*5*15

*6*4

*7*3

*8*50

*9*

After swapping with first and last item, we have 1 number completely sorted (`50`

). However, our max-heap is no longer valid and we need to run heapify again.

However now our heap is no longer a max-heap, so we need to run heapify again to get our heap back in order.

Notice that we change the heapify `length`

argument to `i`

on each iteration. This allows heapify to ignore the sorted end of the array.

## heap_sort

```
class Heap:
@staticmethod
def _swap(array, a, b):
array[a], array[b] = array[b], array[a]
def _heapify(self, heap: [int], i: int, length: int):
largest = i
left = i * 2 + 1
right = i * 2 + 2
if left < length and heap[left] > heap[largest]:
largest = left
if right < length and heap[right] > heap[largest]:
largest = right
if largest != i:
self._swap(heap, i, largest)
self._heapify(heap, i, length)
def create_max_heap(self, array: [int]):
length = len(array)
start = length // 2 - 1
for i in range(start, -1, -1):
self._heapify(array, i, length)
return array
def heap_sort(self, heap):
end = len(heap) - 1
for i in range(end, -1, -1):
self._swap(heap, 0, i)
self._heapify(heap, 0, i)
return heap
```

## After heapify

*0*25

*1*16

*2*14

*3*1

*4*6

*5*15

*6*4

*7*3

*8*50

*9*

Once again, we have a proper max-heap structure. We can simply swap the first element up behind 50 now to have 2 numbers sorted.

Once we have a complete heap again, we can guarantee that the first item in our heap is once again the biggest possible number, so we swap it to the end, right behind our previous number. And now we have 2 items sorted!

## After swapping on the 2nd iteration

*0*25

*1*16

*2*14

*3*1

*4*6

*5*15

*6*4

*7*30

*8*50

*9*

We now have two items fully sorted.

We can continue iterating through until our array is fully sorted.