This sorting library gives programmers an easy way to increase
sorting performance by harnessing the highly parallel computational
power available in todays graphics cards.
Tests shows that it can outperform highly optimized CPU-based
Quicksort with a factor of 10 on cards commonly available in 2007.
Sorting 16 million floating point numbers or integers take less
than half a second!
Benchmarks
This graph shows the performance when sorting an array of uniformly distributed floating point numbers on a 8800GTX graphics card.
The comparison is made between the GPU Quicksort Library, Radix sort, Radix/Merge sort, GPUSort (bitonic sort) and the Introsort
(a combination of Quicksort and Heapsort) algorithm in the standard C++ library (on a Opteron 265 1.8GHz). More results are available in the technical report [1].
Download
The source code of the GPU Quicksort Library is released under a BSD license, see source for details.
It is bundled with a testbench for evaluation. It contains a Makefile for Linux and Visual Studio project files for Windows.
Version 1.1 - Released 2008-11-10
Version 1.0 - Released 2007-12-15
To be able to use the GPU Quicksort Library you need to have a CUDA capable graphics card from NVIDIA, display drivers with CUDA support (169.09 and above on Windows, 169.01 and above on Linux) and the CUDA toolkit. If you want to compile the library yourself, you also need to have the CUDA SDK. Everything can be downloaded from the official CUDA website.
Presentation
This presentation was given at the Annual European Symposium on Algorithms in 2008.
Papers
[1] Daniel Cederman and Philippas Tsigas, A Practical Quicksort Algorithm
for Graphics Processors,Technical Report 2008-01, Chalmers University of Technology,
January 2008. [pdf]
[2] Daniel Cederman and Philippas Tsigas, A Practical Quicksort Algorithm for Graphics Processors. In the Proceedings of the 16th Annual European Symposium on Algorithms (ESA 2008), Lecture Notes in Computer Science Vol.: 5193, pages 246 - 258, Springer-Verlag 2008. [pdf]
[3] Daniel Cederman and Philippas Tsigas, On sorting and load balancing on GPUs.
In the ACM SIGARCH Computer Architecture News, Vol. 36, No. 5, pages: 11-18, ACM press 2009.
[doi]
[4] Daniel Cederman and Philippas Tsigas, GPU-Quicksort: A Practical Quicksort Algorithm for Graphics Processors.
In the ACM Journal of Experimental Algorithmics (JEA), Vol. 14, No. 4, ACM press 2009. [doi]
|