Generally, a Heap is a binary tree data structure which satisfies the Heap Property (explained below).
For simplicity, this page will talk only about balanced, max-heaps that contain numbers.
Contents
Purpose
This article or section may not have content matching Scratch Wiki editing standards. Please improve it according to Scratch Wiki:Guidelines and Scratch Wiki:Editing Conventions. (August 2019) |
A heap is a way to hold and represent numbers. A heap is an alternative to a list. Both can hold numbers, but a heap is more efficient at some operations and less efficient for others.
In Scratch, a heap is most useful in two ways: it is good at sorting a list and serving as a priority queue.
A priority queue is like a list, but the biggest item is always at the front. A priority queue is often used in path-finding, but here is an easier example that should show you what it does:
Twenty clones all want to get through a door. How much they want to do this is stored in a list, but only one can fit at a time. The clone that wants to get through the door the most should be allowed through first. After it gets through, the clone that wants it the second most should be allowed through.
If a list were used, figuring out which sprite to let through would be the same as finding the highest value in a list.
define findHighestItem set [result v] to [1] set [n v] to [1] repeat ((length of [list v]) - (1)) change [n v] by (1) if <(item (n) of [list v]) > (item (result) of [list v])> then set [result v] to (n) end end
This solution is perfectly fine, but the list must be iterated (looped) over every time that the highest element needs to be found For examples where speed is important (for example, path-finding or sorting large lists), we want to do this faster—and that can be done with a heap.
Abstractly
The idea of a heap is that the container does not need to be iterated over if slightly more work is done to keep it sorted. Below a representation of a heap. The following is basic data structure terminology:
- Every item (each circle) is called a node
- Every node that points to another one is called a child
- Every child has a parent (the node above it)
- Every parent has two children
- Every parent is bigger than its children
The last statement is called the heap property. It is the property that must be true for us to call something a heap. If every parent is not bigger than both of its children, we cannot call it a heap.
In Scratch, a list can do a number of things:
- insert a number
- delete a number
- get value at an index
In addition there are a number of things that are really useful to do:
- find the biggest item
- find the smallest item
- sort the list
- figure out if a number exists in a list
Heaps can efficiently:
- insert a number
- transform a list into a heap
- delete a number
- find a number
- find the biggest number
- find the smallest number
Implementation
Because Scratch does not provide a tree data structure, lists can be used instead to program heaps.
How a List can be a Heap
Every item in a heap can be given an x and a y position.
Y position is backwards (zero is on the top). X position is the number of nodes between the left side of the heap and the element. For example, the node with the number "6" in the heap below has an x position of 0 and a y position of 1. The number "7" in the heap below has an x position of 3 and a y position of 2.
The following formulas convert between the index of an element and it's x/y position in a heap. Note that lg(x) is the logarithm in base two (if you do not know what logarithms are, do not worry) and is equal to log(x)/log(2) and that floor(x) is the "round down" function (in Scratch it is "round(x - .5)). Also note that we must find y before we find x when we are calculating heap-position from list-position.
- Index = 2^{y} + x - 1
- y = floor(lg(Index))
- x = Index + 1 - 2^y
Note: in the heap image, the numbers represent values. In the list image, the numbers represent indices. They are NOT the same.
Insertion
To add a number into a list one would use the basic Add () to () (block). Adding a number to a heap is a bit trickier, it must satisfy the heap property that a parent is greater than its children. The "bubble up" function makes sure that new elements are inserted into the correct to satisfy this property.
"Bubble Up" uses recursion. Generally "bubble up" does this:
bubble up (element): if the element is bigger than its parent a) swap them and b) bubble up myself again
So, to add a number to a heap, one can just add the number to the bottom of a heap and "bubble" it up to where it actually belongs.
define swap(a)(b) set[store v] to (item (a) of [heap v]) replace item (a) of [heap v] with (item (b) of [heap v]) replace item (b) of [heap v] with (store) define bubble up(index) if<(index) > (1)> then set [bubble_up_parent v] to ((round(((index) - (1.5)) / (2))) + (1)) // where the parent is if<(item (bubble_up_parent) of [heap v]) < (item (index) of [heap v])> then swap(index)(bubble_up_parent) bubble up(index) end end define insert(number) add (number) to [heap v] bubble up(length of [heap v])
Create Heap
define Bubble Down set[bubble_down_value v] to (index) if<not <((2) * (index)) > (length of [list v])>> then if<(item((2) * (index)) of [queue v]) < (item (index) of [list v])> then set[bubble_down_value v] to ((2) * (index)) end if<not<(((2) * (index)) + (1)) > (length of [list v])>> then if<(item(((2) * (index)) + (1)) of [list v]) < (item (bubble_down_value) of [list v])> then set[bubble_down_value v] to (((2) * (index)) + (1)) end end if<not<(bubble_down_value) = (index)>> then Swap (bubble_down_value)(index) Bubble Down(bubble_down_value) define Make into Heap set [n v] to (length of [list v]) repeat(length of [list v]) Bubble Down(n) change [n v] by (-1)
Find Maximum
This is sort of the entire point of a heap! You instantly know which item is the biggest—it's the first one!
define findBiggestValue set[result v] to (item (1 v) of [heap v])
Deletion
There are two different types of deletion. The most first one is the removal of the biggest number. For priority queues and sorting this is the only deletion that is needed, so this is the one that will be shown first. For this one needs the opposite of "bubble up": "bubble down" To delete the biggest number it is first swapped with the last number (which is usually pretty small). Then the biggest number (which is on the end) is erased and the small number moves down the tree until the tree is a heap again.
define swap(a)(b) set[store v] to (item (a) of [queue v]) replace item (a) of [queue v] with (item (b) of [queue v]) replace item (b) of [queue v] with (store) define Bubble Down(index) set[bubble_down_value v] to (index) if<not <((2) * (index)) > (length of [queue v])>> then if<(item((2) * (index)) of [queue v]) < (item (index) of [queue v])> then set[bubble_down_value v] to ((2) * (index)) end if<not<(((2) * (index)) + (1)) > (length of [queue v])>> then if<(item(((2) * (index)) + (1)) of [queue v]) < (item (bubble_down_value) of [queue v]> then set[bubble_down_value v] to (((2) * (index)) + (1)) end end if<not<(bubble_down_value) = (index)>> then Swap (bubble_down_value)(index) Bubble Down(bubble_down_value) end define removeMax swap(1)(length of [queue v]) delete (1 v) of [queue v] Bubble Down(1)
To remove any element is very similar, but the smaller number that one replaces a number with can move either up or down:
define removeElement(index) swap(index)(length of [heap v]) remove item (last v) of [heap v] bubbleUp(index) bubbleDown(index)
Find Number
A heap isn't better or worse than a list at finding if a specific number exists. One must loop through every number (though we can ignore anything bigger than the first element)
define find(value) set[result v] to (-1) // -1 means the value is not in the array if<not<(value) > (item (1 v) of [heap v])>> then set[n v] to (0) repeat(length of [heap v]) change [n v] by (1) if<(item (n) of [heap v]) = (value)> then set [result v] to (n) stop [this script v] end end
Sorting
Even though heaps are just ways of storing numbers, the concept can be applied to the sorting-list problem. For really big lists, the sorting method below is about as fast as any of the methods in Sorting Values. Here is what we are going to do: make a list into a heap, continually remove the biggest element and put it into the sorted list.
define sort // sorts the list 'unsorted' delete (all v) of [sorted v] Make into Heap // make "unsorted" into a heap repeat until<(length of [unsorted v]) = (0)> add(item (1 v) of [unsorted v]) to [sorted v] Remove Max end
Min Heap
This article focuses on implementing a heap that always has the biggest number at the top. It is possible to construct a heap with the smallest number on the top too. This is useful if you want to sort from lowest to highest, or want a priority queue that goes from smallest to largest values. For the most part, the code all stays the same except all comparisons of element's values. In other words, every
<(item (a) of [heap v]) < (item (b) of [heap v])>
and
<(item (a) of [heap v]) > (item (b) of [heap v])>
switch. The computer then reverses the "heap property" and makes every parent smaller than its children.