Sort Home | Count Sort, Quick Sort, SQL |
Source Code | History |
I changed the name from histogram sort because of an email correspondence I had with Richard Harter. Richard says he has been using the term histogram sort for years, and also discovered an in-place version in 1984. During the course of our exchanges it turned out that there were differences in our approach. At this point I still believe that my approach provides the fastest known in-place sort. I would certainly like to hear from anyone who has a faster sort.
Richard | As a note, the reason that radix sorts are potentially faster than comparison sorts is that in effect they do comparisons in parallel. Thus a radix sort with 256 (= 2**8) bins potentially requires 1/8 the number of passes as a comparison sort. There is a catch; a radix sort must make m passes where m is the record length and a comparison sort must make log2(n) passes. If the keys are unique m > log2(n) and can be considerably greater. |
Marion | I got the best results by using a radix approach for the first passes, but switching to a quick sort when the data size in a bucket got below a fixed threshold |
If you use a hybrid approach where you use a histogram sort for for sorting a large number of items, and some other sort for small numbers of items you can take advantage of both algorithms. and Richard's assumption is made invalid. There is an overhead required for code to decide which algorithm to use, but benchmarks have shown that this overhead is offset by the gains of the hybrid approach.
Richard | Each element is examined at most twice and moved at most once. |
Marion | In my method each element is examined once and moved once. It may be moved to the place where it started. On the other hand, my method requires two copies of the histogram. |
I got the best results with fairly small histograms and large numbers of items to sort. If you are sorting a million items using a 256 element histogram the overhead of copying the histogram is much less than the extra half a million load, table lookup, and compare operations required by Richards method. As the items to be sorted become less on subsequent passes, the advantage rapidly disappears, and it becomes better to switch to another sorting method.
I think it was some time in 1998 that I realized that it would be possible to use a histogram based sort algorithm. I wrote coded the algorithm and compared it with the Microsoft sort program that comes with Windows 98. It ran many times faster on large data sets. I wrote Microsoft a letter. They weren't interested in talking to me. It never occurred to me that Microsoft wouldn't use the fastest available algorithm for their sort, but I did find that out later. I applied for a patent on a hardware implementation of the idea. After about a year of part-time investigation, I found out that the idea already existed, but it was called a count sort, or a bucket sort. I was disappointed, but not too surprised, as I have been inventing things for a long time, and have often found others were there before me.
I was about to give up on the project, when I had the thought of trying to make a single-buffered, or in place version of the sort. It took about two days, and I had working code. I benchmarked the code against the fastest know single buffered algorithm, the quick sort, and it was faster. I benchmarked it against the double buffered version of the count sort, and it was faster. I searched the literature and found a reference in Knuth stating that the count sort required double buffering, or a linked list, in which case the linked list would be proportional to the size of the data set being sorted. My implementation only used the same small auxiliary data structure as the count sort.
I got a list of computer science professors at various educational institutions in Northern California, where I live, and sent them all a copy of my source code and my benchmark results. I got two replies, one asked to be take off the mailing list, and the other from Professor Fateman. I ended up having a long email correspondence with Professor Fateman. He was not encouraging.
A friend suggested that I mail a paper to the ACM. I did that. They replied that they couldn't publish the paper because it was too short, and they thought I needed to check how the algorithm worked on a multiprocessor machine. I didn't have a multiprocessor machine. I put my findings on my AOL web site and waited for some response. Meanwhile, I need to earn a living, so I started a small business writing and selling software for home embroidery machines.