The Lowly Programmer: Introducing: Marriage Sort

Wednesday, April 14, 2010

Introducing: Marriage Sort

Update: I've created a live visualization of this algorithm, so you can see it in action - see Marriage Sort, a Visualization.

Two weeks ago, a link showed up on Hacker News on how to (mathematically) select the best wife. Tongue firmly in cheek, of course. The key takeaway from the article is that knowing only the relative rankings of items in a sequence (and assuming you can never go back to a previous one), you can maximize your chances of selecting the maximum one by letting N/e go by and then picking the next one that's better than all those. Further, to maximize your expected value you only need to let the first √N - 1 go by. After rattling around in the back of my mind for a couple weeks, yesterday this tidbit coalesced into an actual Idea. Could I turn this selection routine into a sorting algorithm?

And thus, Marriage Sort was born. I present it here purely out of interest, not because it is especially fast - if nothing else, because it is the only sorting algorithm I know of that has an average complexity of O(n^1.5)¹.

Algorithm

The basic idea of this algorithm is to repeatedly choose the maximum element from the first √N - 1 elements of our working set, and then scan to the end looking for elements bigger than it. When one is found, swap it to the end of the working set, and decrease the working set by one. After the pass is complete, swap out the maximum element from the first √N - 1 as well, and start again. When everything is finished (√N - 1 <= 0), use insertion sort to put the array into true sorted order.

One pass of the Marriage Sort. Two elements are found and swapped to the end, followed by the maximum element from the first √N - 1.

This works because, as we have noted from the article linked above, each element bigger than the first √N - 1 is expected to be close to the largest remaining element in the array. By repeatedly taking these elements and moving them to the end, we put the array into an approximately sorted order, where elements should be reasonably close to their 'true' positions. When this is done insertion sort will do the final rearranging, moving elements the short distances to their true positions.

In pseudocode:

def marriagesort(array):
    end = array.length
    while true:
       skip = sqrt(end) - 1
       if skip <= 0: break

       # Pick the best element in the first vN - 1:
       bestPos = 0, i = 1
       while i < skip:
           if array[i] > array[bestPos]: bestPos = i
           i += 1

       # Now pull out elements >= array[bestPos], and move to the end:
       i = skip
       while i < end:
           if array[i] >= array[bestPos]:
               array.swap(i, end-1)
               end -= 1
           else:
               i += 1

       # Finally, move our best pivot element to the end
       swap(bestPos, end-1)
       end -= 1

    # Finish off with insertion sort to put the elements into true sorted order
    insertionsort(array)

Here you can see this algorithm working on a randomized array. Many thanks to Aldo Cortesi for the wonderful sorting algorithm visualization framework!

Update: Here is another visualization made using Aldo Cortesi's tools, of a 512 element array this time. If you take a close look at the insertion sort phase (the slope on the right side), you can see that it is composed out of a lot of little triangles. Each of these triangles corresponds to one pass of the marriage selection routine, and shows how far elements can be from their true sorted position.

Complexity

Some quick back-of-the-napkin calculations indicate that this algorithm takes O(n^1.5) compares in the average case, although the worst case looks to be O(n²). For the first stage (i.e. without the final insertion sort) each pass will take at most n compares and move approximately √N items to the end, requiring about √N passes. I don't know if this is a tight bound - the passes will actually speed up and require fewer compares as the working set shrinks, however I didn't want to try including that in the calculations. Similarly, after each pass all the items moved are guaranteed to be greater than all the items left, capping the distance moved items are from "home" to √N on average for the insertion sort. This holds the insertion sort to O(n^1.5) compares on average as well.

Similarly, this algorithm takes O(n^1.5) swaps in the average case (and O(n²) in the worst case). The number of swaps could certainly be decreased if a different final sort were used - the first pass only takes Θ(n) by itself - although in practice this isn't much use since the compares will still dominate the runtime.

Note that both of these graphs are log-log plots, so the important feature to look at is the slope of lines rather than the absolute position. For example, in the "Swaps" plot quicksort appears to be hovering around the n∙log(log(n)) line, however looking at the slope we can see that it is actually approximating n∙log(n), just with a better constant factor.

Note 2: The x axis of these graphs is the number of elements in the array, and the y axis is the number of comparisons/sorts required. I should have labeled these better.

Postscript

Some sample code implementing this algorithm is up on GitHub - this is the code to generate the log-log plots shown above.

As always, questions and comments are welcome!

Footnotes

Most sorting algorithms are either Θ(n²) or Θ(n∙log(n)) in the average case. This one falls between those two extremes, making it faster (asymptotically) than algorithms like insertion sort or bubble sort but slower than quicksort or merge sort.

22 comments

kbobApril 14, 2010 at 10:15 AM
Is there an error in your pseudocode? It looks like the first inner while loop should be "while i < skip" instead of "while i < end". Am I missing something?
ReplyDelete
Replies
Eric BurnettApril 14, 2010 at 10:30 AM
You are correct, my pseudocode was broken. The first inner while loop should have been i < skip, and the second one for i < end was missing entirely. Thanks for catching that!
ReplyDelete
Replies
AnonymousApril 14, 2010 at 11:19 AM
That's Python, not pseudocode! :-D
ReplyDelete
Replies
Eric BurnettApril 14, 2010 at 11:25 AM
If I call it pseudocode I can skip details like float->int casts, and I don't have to make sure it actually runs ;)
ReplyDelete
Replies
AnonymousApril 14, 2010 at 1:06 PM
What did you use to create one+pass.png?
ReplyDelete
Replies
Eric BurnettApril 14, 2010 at 1:41 PM
Photoshop. I don't have any good tools for making diagrams like that, unfortunately, so that one was made by hand.
ReplyDelete
Replies
AnonymousApril 14, 2010 at 5:52 PM
Your footnote reference takes me to log in to Blogger. What?
ReplyDelete
Replies
Eric BurnettApril 14, 2010 at 6:11 PM
Ugh, that's the blogger post interface trying to be "smart". I write the posts in html, but when I use the "Compose" interface to add the images, it has a tendency to corrupt the code I wrote. I can only hope there is a better editor coming someday.

Should be fixed now.
ReplyDelete
Replies
martin lukacApril 14, 2010 at 8:14 PM
Python is pseudo code!:)
ReplyDelete
Replies
Speaker-to-AnimalsApril 15, 2010 at 7:58 AM
This is also known as the "secretary problem" and is nothing spectacularly new. More accurate formulae are given at http://en.wikipedia.org/wiki/Secretary_problem, showing the odds of selecting the best one are far from certain and actually approach zero if you don't know the value of N initially; if N is known initially, they converge on 0.368 as N increases.
ReplyDelete
Replies
AndyApril 15, 2010 at 8:43 AM
Just for your reference, shell sort (with particular gap sequences) gives O(n^1.5) worst case.

Shell sort is probably my favourite sort, just because of the magic behind the gap sequences :)
ReplyDelete
Replies
Jeremy FincherApril 15, 2010 at 8:50 AM
It'd be interesting to compare this to shell sort , another sorting algorithm whose complexity is greater than O(n log n) but less than quadratic.
ReplyDelete
Replies
Eric BurnettApril 15, 2010 at 10:08 AM
@Andy: So it does! I had looked at Shell sort briefly and saw it was O(n log(n)^2) in the best case, but I didn't realize it had other known sequences with differing complexities. Time to read up on Shell sort I think.

@Jeremy: The most common Shell sort has complexity O(n log(n)^2), so for a rough comparison we can simply plot the two of them: http://www.wolframalpha.com/input/?i=plot+x^1.5,+x*(log(x)/log(2))^2+from+1+to+100000
It would seem these two sorts are approximately equivalent in the average case (ignoring constant factor differences) for arrays up to about 100,000 elements. Hmm...I may have to try implementing Shell sort, and see how it actually compare in practice. If I do, I'll report the results back here.
ReplyDelete
Replies
Ashutosh MehraApril 15, 2010 at 10:11 AM
Shell sort (with 2^j-1 increments) is one algorithm that has a complexity of O(n^{3/2}).
ReplyDelete
Replies
AnonymousApril 15, 2010 at 10:33 AM
How about posting an image using that "fruit salad" visualization from sortvis? I always found that a lot more informative than the spaghetti tangle one:

http://corte.si/posts/code/sortvis-fruitsalad/index.html
ReplyDelete
Replies
Eric BurnettApril 15, 2010 at 11:02 AM
@Anonymous: sure, here is a visualization with 512 items:
http://3.bp.blogspot.com/_4uUxYbT6E0Q/S8cpqJpVF1I/AAAAAAAAD7I/GdEUsBuqHQk/s1600/marriagesort.png

Note that this only shows swaps, not compares, so the time isn't as dominated by the insertion sorts as it appears.
ReplyDelete
Replies
Eric BurnettApril 15, 2010 at 6:55 PM
@Anonymous: Ok, I have made a better version of the "fruit salad" visualization and added it to the post. Thanks, that was a good idea.
ReplyDelete
Replies
Ruben BerenguelMay 4, 2010 at 5:24 PM
Brilliant work. I am constantly amazed by the creativity of programmers in their free time! If only we mathematicians knew that much about programming :)

Ruben
ReplyDelete
Replies
slabountyAugust 24, 2010 at 9:48 AM
Sorry about coming so late here, but I just saw this over on Programming Praxis (http://programmingpraxis.com/2010/08/20/marriage-sort/) where I did a Ruby version. One question though, in the picture for the one pass of the merge sort, why isn't the max of the sqrt(N)-1 the value "7" rather than "8"? Wouldn't the sqrt(10)-1 just check the first two values of the array?
ReplyDelete
Replies
slabountyAugust 24, 2010 at 10:24 AM
Ahhh... that makes sense. It's interesting how hard this is to test. The insertion sort at the end covers up a multitude of sins.

I know you put code up on GitHub and you're welcome (if it looks correct to you) to grab it and share it there also.
ReplyDelete
Replies
Bitt3r0blivionNovember 13, 2010 at 8:49 PM
Kinda cool, but not quite sure why you would want to use it. If it does run O(n^1.5) then it is no improvement. There is already merge-sort and heap-sort that run in O(nlg(n)) which is asymptotically faster than O(n^1.5). So there isn't any improvement, nor can there be. There is a lower bound on comparison based sorting algorithms of nlg(n). So it is impossible to beat merge and heap. Unless you use a non comparison based sorting algorithm like counting sort or radix sort.
ReplyDelete
Replies
Eric BurnettAugust 7, 2012 at 6:21 PM
You are quite correct. If my memory serves me right, I took artistic
liberties when making that diagram to make the swap pattern obvious,
but the partition really should be the first two elements only.

Ruby port you say? Awesome! I shall have to check it out when I get
back to a real computer.
ReplyDelete
Replies

Add comment

The Lowly Programmer