Daily bit(e) of C++ | Lexicographically smallest subsequence that contains all unique letters
Daily bit(e) of C++ #294, Common interview problem: Lexicographically smallest subsequence that contains all unique letters.
Today, we will look at a common C++ interview problem: Lexicographically smallest subsequence that contains all unique letters.
Given a string, determine the lexicographically smallest substring (non-consecutive) that contains one instance of each unique character from the string.
For example, for the string “The quick brown fox jumps over the lazy dog”, the lexicographically smallest substring is “T qickbownfxjumpsverthlazydg”.
Before you continue reading the solution, I encourage you to try to solve it yourself. Here is a Compiler Explorer link with a couple of test cases: https://compiler-explorer.com/z/x4Wsr6jjE.
Solution
The problem is fairly constrained. First, the resulting string must contain all unique letters. Because of this, the only difference between potential results is the order (i.e. the permutation) of these letters.
The order is also constrained. First, the result must be a subsequence of the input. Second, we are looking for a lexicographically minimal string.
If we want to optimize for lexicographic ordering, we want to order letters with low values as early as possible. However, this is where the first constraint comes in. We cannot order a letter ahead of another if there are no further instances later in the string.
If we put this together, we can use a monotonic buffer approach. In this case, we limit ourselves to poping characters with a higher value and a later instance in the string.