Daily bit(e) of C++ | Top k frequent values
Daily bit(e) of C++ #231, Common interview problem: Top k frequent values.
Today we will look at a common C++ interview problem: Top k frequent values.
Given a list of integers as std::vector<int> and the count k, return the top k most frequent values in the list in any order.
You can assume that the input leads to a unique solution, and your solution is required to operate faster than O(n*logn).
For example, for the input {1,3,1,2,5,3,4} and k==2, the output should be {1,3}.
Before you continue reading the solution, I encourage you to try to solve it yourself. Here is a Compiler Explorer link with a couple of test cases: https://compiler-explorer.com/z/7E5bjq8sb.
Solution
This problem is a “How well do you know the standard library?” type of problem.
We are left with only a few options since we have to provide a solution that runs faster than O(n*logn). Technically, there is an O(n) solution using std::nth_element (i.e. QuickSelect); however, QuickSelect has a high constant overhead. Instead, we can provide a O(n*logk) solution, for which have several options:
ordered containers: std::priority_queue, std::set
heap algorithms: std::push_heap, std::pop_heap
partial sort: std::partial_sort, std::partial_sort_copy
In particular, using std::partial_sort_copy allows us to implement a simple two-step solution.