Tim Sort Algorithm

Tim Sort is a hybrid sorting algorithm derived from Merge Sort and Insertion Sort. It was designed to perform well on many kinds of real-world data. Tim Peters, a Python developer, implemented it in 2002 for the Python programming language’s sort method. Since then, it has been the default sorting algorithm in Python’s sorted() function and list.sort() method.

Tim Sort is particularly efficient because it takes advantage of existing order in the data. It identifies small, nearly sorted sequences (runs) and sorts them using Insertion Sort. Then, it merges these runs using a merge process similar to Merge Sort. This combination allows Tim Sort to be very efficient for practical data sets that often have some existing order.

Key Characteristics of Tim Sort

  • Stability: Tim Sort is stable, meaning it preserves the relative order of equal elements.
  • Adaptive: It adapts to the existing order in the data, making it efficient for partially sorted data.
  • Worst-case Time Complexity: O(n log n)
  • Best-case Time Complexity: O(n) for already sorted data
  • Space Complexity: O(n)

Tim Sort Algorithm Implementation in Python

Here’s an example of how Tim Sort can be implemented in Python:

def insertion_sort(arr, left, right):
    for i in range(left + 1, right + 1):
        key = arr[i]
        j = i - 1
        while j >= left and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

def merge(arr, l, m, r):
    len1, len2 = m - l + 1, r - m
    left = arr[l:l + len1]
    right = arr[m + 1:m + 1 + len2]

    i, j, k = 0, 0, l

    while i < len1 and j < len2:
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1

    while i < len1:
        arr[k] = left[i]
        i += 1
        k += 1

    while j < len2:
        arr[k] = right[j]
        j += 1
        k += 1

def tim_sort(arr):
    MIN_MERGE = 32

    def min_run_length(n):
        r = 0
        while n >= MIN_MERGE:
            r |= n & 1
            n >>= 1
        return n + r

    n = len(arr)
    min_run = min_run_length(n)

    for start in range(0, n, min_run):
        end = min(start + min_run - 1, n - 1)
        insertion_sort(arr, start, end)

    size = min_run
    while size < n:
        for left in range(0, n, 2 * size):
            mid = min(n - 1, left + size - 1)
            right = min((left + 2 * size - 1), (n - 1))

            if mid < right:
                merge(arr, left, mid, right)

        size = 2 * size

    return arr

# Example usage
arr = [5, 21, 7, 23, 19]
sorted_arr = tim_sort(arr)
print("Sorted array is:", sorted_arr)

Code Explanation

Insertion Sort Function

def insertion_sort(arr, left, right):
    for i in range(left + 1, right + 1):
        key = arr[i]
        j = i - 1
        while j >= left and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
  • The insertion_sort function is used to sort small subarrays (runs).
  • It sorts the elements from index left to right using the Insertion Sort algorithm.

Merge Function

def merge(arr, l, m, r):
    len1, len2 = m - l + 1, r - m
    left = arr[l:l + len1]
    right = arr[m + 1:m + 1 + len2]

    i, j, k = 0, 0, l

    while i < len1 and j < len2:
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1

    while i < len1:
        arr[k] = left[i]
        i += 1
        k += 1

    while j < len2:
        arr[k] = right[j]
        j += 1
        k += 1
  • The merge function merges two sorted subarrays into a single sorted subarray.
  • left and right are temporary arrays to hold the elements of the subarrays.

Tim Sort Function

def tim_sort(arr):
    MIN_MERGE = 32

    def min_run_length(n):
        r = 0
        while n >= MIN_MERGE:
            r |= n & 1
            n >>= 1
        return n + r

    n = len(arr)
    min_run = min_run_length(n)

    for start in range(0, n, min_run):
        end = min(start + min_run - 1, n - 1)
        insertion_sort(arr, start, end)

    size = min_run
    while size < n:
        for left in range(0, n, 2 * size):
            mid = min(n - 1, left + size - 1)
            right = min((left + 2 * size - 1), (n - 1))

            if mid < right:
                merge(arr, left, mid, right)

        size = 2 * size

    return arr
  • The tim_sort function is the main function that implements the Tim Sort algorithm.
  • MIN_MERGE is the minimum size of a run. If the size of the array is less than MIN_MERGE, it is sorted using Insertion Sort.
  • min_run_length calculates the minimum run size based on the length of the array.
  • The array is divided into runs of size min_run and sorted using Insertion Sort.
  • The sorted runs are then merged using the merge function.
  • The size of the merged runs doubles in each iteration until the entire array is sorted.

Example Usage

arr = [5, 21, 7, 23, 19]
sorted_arr = tim_sort(arr)
print("Sorted array is:", sorted_arr)
  • An example list arr is sorted using the tim_sort function, and the sorted list is printed.

Output

Sorted array is: [5, 7, 19, 21, 23]
  • The final output is the sorted array in ascending order.

Summary

Tim Sort is a hybrid sorting algorithm that combines the efficiency of Merge Sort and Insertion Sort. It is the default sorting algorithm in Python due to its performance on real-world data. Tim Sort takes advantage of existing order in the data by identifying and sorting small runs using Insertion Sort and then merging these runs using a merge process. This makes it both adaptive and efficient, with a worst-case time complexity of O(n log n) and a best-case time complexity of O(n) for already sorted data.