I am reading "Insertion Sort is O(nlogn) by Michael A. Bender , Martín Farach-Colton , Miguel Mosteiro" and I don't quite understand how the algorithm works and how to implement it even with the help of Wikipedia. The following is the description of the algorithm extracted from the original article.
1) Let A be an n-element array to be sorted. These elements are inserted one at
a time in random order into a sorting array S of size (1 + ε)n.
So the first step is creating array of size (1 + ε)n. Let ε = 1, then I need to create an array with twice the size of the original array.
2) The insertions proceed in log(n) rounds as follows.
Each round doubles the number of elements inserted into S and doubles the prefix of S where elements reside.
I understand that there will be outer loop that will loop log(n) time. Each round, I need to double the number of elements from A (original array) to S array. What I don't really understand is "double the prefix of S".
3) Specifically, round ith ends when element 2i is inserted and the elements are rebalanced. Before the rebalance, the 2i elements are in the first (1 + ε)2i positions.
A rebalance moves them into the first (2 + 2ε)2i positions, spreading
the elements as evenly as possible. We call 2 + 2ε the spreading factor.
From what I understand is that for every round, we will do "rebalance". "rebalance" will uniformly spread the original element in S array so that it leaves some gap between the element. The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
4) The insertion of 2i−1 intercalated elements within round ith is performed the
brute force way: search for the target position of the element to be inserted by
binary search (amongst the 2i−1 support positions in S), and move elements
of higher rank to make room for the new element. Not all elements of higher
rank need to be moved, only those in adjacent array positions until the nearest
gap is found.
This part shows how to insert each element into S array. First, use binary search to search for where the element should belongs. Then, shift the higher rank until it hit the gap.
This is the translation of the algorithm from what I understand (where A is array to sort and array start with index of 1):
n ← length(A)
S ← array of size (1 + ε) * n
for i ← 1 to n
S[i] = null
for i ← 1 to floor(log(n) + 1)
for j ← 2i - 1 to 2i
index = binarysearch(S, A[j])
insert(S, A[j], index)
Then for insertion() function takes 3 parameters: array, item to insert, and location.
def insert(S, item, index)
if S[index] != null
tmp ← S[index]
i ← index + 1
while i <= length(S) and S[i] != null
S[index] ← item
Ad "double the prefix of S": The array (memory) is allocated once at the beginning to the size of (1 + ε) n, where n is total number of elements to be sorted. But the elements are added gradually and as they are added, they are not spread across the whole array, but only some prefix of it. When m elements are rebalanced, they are spread across first (1 + ε) m elements of the array. This is the prefix. When m doubles, so does (1 + ε) m.
Ad correctness: I see one slight mistake:
The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
The quoted description does not say what the formula is, but it can't be this one. Because this would map array of length m to length (1 + ε) m, but the description says you are mapping array of length (1 + ε) m to array of length 2 (1 + ε) m.
A simple expression would be k = 2 i where i is old index, but that would not spread the elements evenly. To spread the elements evenly, the formula is k = (2 + 2 ε) i, but i is index excluding any gaps.