Daily bit(e) of C++ | Bounded subarrays
Daily bit(e) of C++ #175, Common interview problem: Bounded subarrays.
Today we will look at a common C++ interview problem: Bounded subarrays.
Given an array of integers and two boundary values, A and B, return the number of (contiguous) subarrays for which the value A is the minimum and value B is the maximum of the subarray.
For example, for the array {1,2,3,4,3} and bounds [2,4], there are only two valid subarrays: {2,3,4} and {2,3,4,3}.
Before you continue reading the solution, I encourage you to try to solve it yourself. Here is a Compiler Explorer link with a couple of test cases: https://compiler-explorer.com/z/6d84jE5c9.
Solution
The first observation to consider is that values outside the range [A, B] effectively create boundaries and divide our array into subsections that cannot interact.
So let’s consider what happens in one of these chunks when we observe the values A and B.
Consider A==2 and B==4. With an array {3,3,2}, we have zero instances of a subarray. Adding another value {3,3,2,4}, we jump to three instances ({3,3,2,4},{3,2,4},{2,4}). And in fact, we can calculate this number from the positions of A and B: std::min(a_idx,b_idx)+1. Adding another value that isn’t A or B increases the number of instances by this amount. For {3,3,2,4,3}, we are at six instances (adding {3,3,2,4,3},{3,2,4,3},{2,4,3}).
Adding A has a more interesting effect. Another A means that we can omit the previous instance of A. For {3,3,2,4,2}, we have the original three instances and four more on top of that ({3,3,2,4,2},{3,2,4,2},{2,4,2},{4,2}), which is again std::min(a_idx,b_idx)+1, but taking into account only the rightmost instances of A or B. You might also notice that B has become the limiting factor in our std::min expression.
So if we only had values from the range [A, B], we could scan the array, keeping track of the last instance of A and B, calculating the number of instances for each index.
Fortunately, boundary values do not break this approach. Consider that even when we do not have a boundary, we are already calculating the number of instances against the natural boundary of the array. The only change with boundary values is that they effectively move the start of the array for our computation.