----------------------------------------------------------------------------
At this point we have an input array of length N and a computed minrun. The algorithm proceeds as follows:
-------------------------------------------------------------------------------------------------
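At this point the input array has been split into runs. If the input is close to random, the run sizes will all be close to minrun; if the data contains ordered ranges, the runs will be larger than minrun. The runs now need to be merged to complete the sort, and of course two requirements must be met: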
------------------------------------------------------------------------------------------------------------------------------------------
Modifications to the Merge Sort
---------------------------------------------------------------------------------------------------------------------------------------------
The basic idea: if one run turns out to be much larger than the other, switch modes, enter galloping mode, and move elements in bulk.
Everything seems perfect in the merge sort above, except for one thing. Imagine merging two arrays like these:
A = {1, 2, 3, ..., 9999, 10000}
B = {20000, 20001, ..., 29999, 30000}
The procedure above works for them too, but every time step four is executed it performs one comparison and one move, for a total of 10000 comparisons and 10000 moves. Timsort offers a modification here called galloping. It works as follows:
- Start the merge as described above.
- Each time an element is moved from the temporary run or the larger run to the final array, record which run it came from.
- If a certain number of consecutive elements (7 in this presentation of the algorithm) has come from one and the same run, it can be assumed that the next element will come from that run too. To check this, galloping mode is switched on: the current element of the other run is looked up with a binary search in the run that is expected to supply the next batch of data (the run is sorted, so a binary search is valid). A binary search is more efficient than a linear one, so the number of comparisons is much smaller.
- Finally, when the data in the supplying run no longer fits (or the end of the run has been reached), everything found so far can be moved in bulk, which can be more efficient than moving elements one at a time.
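The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not Timsort's actual merge: it writes into a new list instead of merging in place through a temporary array, it leaves galloping mode after a single bulk move, and it uses the standard `bisect` module for the binary search. The names `merge_with_galloping` and `MIN_GALLOP` are made up for this example; the threshold 7 is the value quoted in the text.

```python
from bisect import bisect_left, bisect_right

MIN_GALLOP = 7  # hypothetical name; streak length that triggers galloping


def merge_with_galloping(a, b):
    """Merge two sorted lists, switching to bulk copies after a winning streak."""
    out = []
    i = j = 0
    streak_a = streak_b = 0  # consecutive elements taken from each run
    while i < len(a) and j < len(b):
        if streak_a >= MIN_GALLOP:
            # Binary-search a[i:] for the first element greater than b[j],
            # then copy everything before it in one go.
            k = bisect_right(a, b[j], i)
            out.extend(a[i:k])
            i = k
            streak_a = 0
        elif streak_b >= MIN_GALLOP:
            # Binary-search b[j:] for the first element not less than a[i],
            # then copy everything before it in one go.
            k = bisect_left(b, a[i], j)
            out.extend(b[j:k])
            j = k
            streak_b = 0
        elif a[i] <= b[j]:  # ties go to a, which keeps the merge stable
            out.append(a[i])
            i += 1
            streak_a += 1
            streak_b = 0
        else:
            out.append(b[j])
            j += 1
            streak_b += 1
            streak_a = 0
    # One run is exhausted; the rest of the other run is already in order.
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

Calling `merge_with_galloping(list(range(1, 10001)), list(range(20000, 30001)))` takes seven single-element steps from the first list and then moves the remaining elements of that list with one slice copy.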
The explanation may be a bit vague, so let's look at an example:
A = {1, 2, 3, ..., 9999, 10000}
B = {20000, 20001, ..., 29999, 30000}
- In the first seven iterations, the numbers 1, 2, 3, 4, 5, 6 and 7 from run A are each compared with 20000 and, since 20000 is greater every time, they are moved from run A to the final array.
- Starting with the next iteration, galloping mode is switched on: 20000 is compared in turn with 8, 10, 14, 22, 38 and so on from run A, with the gap doubling each time (2, 4, 8, 16, ...), and finally with 10000. As you can see, the number of such comparisons is far smaller than 10000.
- Once the end of run A is reached, every remaining element of run A is known to be smaller than 20000, the first element of run B (the search could also have stopped somewhere in the middle of the run). The data from run A is moved to the final array in bulk, and the procedure is repeated.
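To make the saving concrete, the probe sequence from the example can be reproduced with a few lines of Python. This is only a sketch: the doubling gap (2, 4, 8, ...) is inferred from the numbers listed above, the helper name `gallop_probes` is invented for this illustration, and production implementations may probe at different offsets and follow up with a binary search inside the last gap.

```python
def gallop_probes(run, start, key):
    """Return the values of `run` that are compared with `key` while galloping.

    Probing begins at index `start`; the gap to the next probe starts at 2
    and doubles each time. It stops at the first probe that is >= key, or
    at the last element of the run.
    """
    probes = []
    pos, step = start, 2
    last = len(run) - 1
    while True:
        pos = min(pos, last)  # never probe past the end of the run
        probes.append(run[pos])
        if run[pos] >= key or pos == last:
            return probes
        pos += step
        step *= 2


A = list(range(1, 10001))
probes = gallop_probes(A, 7, 20000)  # index 7 holds the value 8
print(probes[:5])   # [8, 10, 14, 22, 38]
print(len(probes))  # 14
```

Together with the seven comparisons made before galloping starts, run A is consumed after 21 comparisons in this sketch, compared with 10000 for the element-by-element merge.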