Sorting

Working with sorting algorithms
  • Any reasonable library sorting routine runs in O(n log n) time; most are based on quicksort or timsort. See Rust sort.
  • Sorting up front pays off when the same data is searched repeatedly, since each search on sorted data can use binary search in O(log n).
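
As a rough sketch of the "sort once, search many times" idea, the snippet below sorts a list a single time and then answers membership queries with bisect_left. The helper name contains and the sample data are made up for illustration.

```python
from bisect import bisect_left


def contains(sorted_items, target):
    """Return True if target occurs in sorted_items (which must already be sorted)."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


data = [42, 7, 19, 3, 25]
data.sort()  # pay the O(n log n) cost once

queries = [19, 8, 42]
hits = [q for q in queries if contains(data, q)]  # each lookup is O(log n)
```

With only one or two queries a plain linear scan is cheaper than sorting first; the sort starts to pay off as the number of lookups grows.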

Python

  • Use a.sort() to sort a in place. An optional key function can be passed as a keyword argument, e.g., a.sort(key=lambda x: x.value)
  • Use sorted(a) to get a new sorted list from any iterable, similar to how reversed(a) is used.
Example
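
A minimal sketch of both calls, assuming a made-up Item record with a value field (the names are illustrative only). Note that in Python 3 the key function must be passed by keyword.

```python
from collections import namedtuple

# Hypothetical record type, used only for this illustration.
Item = namedtuple('Item', ('name', 'value'))

a = [Item('b', 3), Item('a', 1), Item('c', 2)]
a.sort(key=lambda x: x.value)        # sorts in place and returns None
names = [item.name for item in a]

b = sorted((5, 1, 4), reverse=True)  # accepts any iterable, returns a new list
```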

Rust

  • On integer type vectors (e.g. Vec<i32>) use .sort()
  • On floating-point vectors (e.g. Vec<f32>) use .sort_by(|a, b| …), since floats implement only PartialOrd, not Ord
rust
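// Integers implement Ord, so sort() works directly.
let mut vec_int = vec![1, 5, 10, 2, 15];
vec_int.sort();

// Floats only implement PartialOrd; partial_cmp returns None for NaN,
// so the unwrap() panics if the vector contains a NaN.
let mut vec_flt = vec![1.1, 1.15, 5.5, 1.123, 2.0];
vec_flt.sort_by(|a, b| a.partial_cmp(b).unwrap());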

13.1 Compute the intersection of two sorted arrays

My solution, though a bit different in implementation, matched the optimal solution recommended by the authors. The difference stemmed from my using iterator syntax with next instead of index trackers that I advanced manually.

An interesting alternative is available when the arrays differ vastly in size, e.g. by an order of magnitude or more. In this case, iterate over the smaller one and binary search the larger one (using bisect_left).
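
Both approaches can be sketched roughly as follows. This is not my submitted code or the authors' listing; the function names are invented for illustration, and both versions assume the inputs are sorted and that duplicates should appear only once in the output.

```python
from bisect import bisect_left


def intersect_two_pointer(a, b):
    """Intersection of two sorted lists, duplicates removed, in O(m + n)."""
    i, j, result = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Only record a value the first time it is seen.
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def intersect_binary_search(a, b):
    """Same result in O(m log n): walk the shorter list, bisect the longer one."""
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    result = []
    for x in small:
        if result and result[-1] == x:
            continue  # duplicate of a value already recorded
        k = bisect_left(large, x)
        if k < len(large) and large[k] == x:
            result.append(x)
    return result
```

With m the length of the shorter list and n the length of the longer one, the binary search version wins when m log n is noticeably smaller than m + n, which is the lopsided case described above.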

Lessons Learned
  • Remember that binary search, easily conducted with bisect_left, is a powerful tool to use when working with sorted arrays.

13.2 Merge two sorted arrays

Good solution, exactly how the authors wanted it, with some personal touches on implementation. This would go well in an interview!

Lessons Learned
  • If you have to merge two sorted arrays into one of them and you don't have a third array, consider starting from the back: write the larger of the remaining entries into the last free slot, so entries that have not been read yet are never overwritten.
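
A minimal sketch of the back-to-front merge, assuming the first array has m valid sorted entries followed by at least n slots of padding. The function name and parameter layout are my own choices for illustration, not the book's signature.

```python
def merge_into_first(a, m, b, n):
    """Merge sorted b[:n] into a, whose first m entries are sorted.

    a must have room for at least m + n entries; it is modified in place.
    """
    i, j, write = m - 1, n - 1, m + n - 1
    while i >= 0 and j >= 0:
        # Place the larger tail entry into the last free slot.
        if a[i] > b[j]:
            a[write] = a[i]
            i -= 1
        else:
            a[write] = b[j]
            j -= 1
        write -= 1
    # Anything left in b still has to be copied; leftovers in a are already in place.
    while j >= 0:
        a[write] = b[j]
        j -= 1
        write -= 1


a = [3, 13, 17, 0, 0, 0, 0, 0]  # three valid entries, the rest is padding
merge_into_first(a, 3, [3, 7, 11, 19], 4)
```

Writing from the back is safe because the write index never drops below the read index into a, so no unread entry gets clobbered.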

13.5 Render a calendar

I submitted a good solution, I think. The authors make a good point, though: you could flatten the event endpoints into a single array with an extra field denoting start or end, sort it by time, then sweep over the endpoints, incrementing a counter at each start, decrementing it at each end, and keeping the running maximum, similar to the classic max profit problem.

Lessons Learned
  • Annotating your data with a tuple type can be useful for tracking the meaning of each entry in an otherwise homogeneous array. For example, Endpoint = collections.namedtuple('Endpoint', ('time', 'is_start')) denotes whether an endpoint is a start or stop timepoint.
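
The endpoint sweep described above might look roughly like this. It is a sketch, not the authors' listing: the function name is invented, events are assumed to be (start, finish) pairs, and intervals are assumed to be closed, so an event that starts exactly when another ends counts as overlapping.

```python
from collections import namedtuple

Endpoint = namedtuple('Endpoint', ('time', 'is_start'))


def max_concurrent_events(events):
    """Return the maximum number of events in progress at any one time."""
    endpoints = [Endpoint(start, True) for start, _ in events]
    endpoints += [Endpoint(finish, False) for _, finish in events]
    # Sort by time; on ties, starts come before ends (closed-interval assumption).
    endpoints.sort(key=lambda e: (e.time, not e.is_start))

    current = best = 0
    for e in endpoints:
        if e.is_start:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best
```

If intervals were treated as half-open instead, the tie-break in the sort key would flip so that ends are processed before starts.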

LC 973 K Closest Points to Origin

This question also has a heap solution, see LC 973 K Closest Point to Origin.

Solution: Create a result array, then loop over the points, calculating each point's distance from the origin and appending a tuple of the distance and the point to the result array. Finally, sort the array (take care to ensure the distance is the first element of each tuple so that it sorts by distance), and return the point (the second entry) from each of the first k elements of the sorted array.

python
from collections import namedtuple
from typing import List

class Solution:
    def kClosest(self, points: List[List[int]], k: int) -> List[List[int]]:
        distance_point = namedtuple('D', ('distance', 'point'))
        result = []

        for p in points:
            x, y = p
            distance = ( x**2 + y**2 )**(1/2)
            result.append(distance_point(distance, p))

        result.sort()
        return [p for distance, p in result[:k]]