Daily bit(e) of C++ | Learn Modern C++ 5/N
Daily bit(e) of C++ #132, A Modern-only C++ course (including C++23), part 5 of N: The standard library
In previous lessons, we have covered basic syntax, types and indirection. This gives you enough information to use the standard library and build complex programs. If you missed the previous lesson, you can find it here:
So today, we will stroll through the standard library, reviewing some new language concepts as needed.
To navigate the vast standard library in a user-friendly format, I can recommend cppreference.com. The website can be overwhelming, but it is a good and up-to-date reference for C++ language and library facilities.
Namespaces
So far, we have been working in the global namespace. However, you might have noticed that every type from the standard library was prefixed with std::, the namespace reserved for the standard library.
The reason why we have namespaces in the first place is to prevent name conflicts. Each symbol in C++ requires a unique definition; however, many frequently used names crop up across libraries. Without namespaces, we would have to name every symbol based on the library name, e.g. instead of “connect”, we would have to name the function “simplesockets_socket_ipv4_connect”. Namespaces allow nested naming and local objects to refer to each other without spelling out the entire namespace path.
When needed, we can also pull a specific namespace into the local or global namespace. However, pulling a namespace into the global namespace can introduce unexpected conflicts and should be avoided (mainly, never use “using namespace” in a header).
When pulling a specific namespace into a local one, you can limit the effect by introducing a local scope.
Date and time
C++ offers a fully featured time and date manipulation library. However, as time manipulation is complex, you should treat this section as a light overview.
There are two primary clocks to consider when we want to obtain time: the system clock and the steady clock. The system clock matches the system time and should be used when working with the actual time (UTC).
The system clock’s problem is that it can be externally adjusted (when synchronizing the system’s clock with time servers). This poses a problem when we try to make accurate measurements by capturing specific time points.
The steady clock is a monotonic clock that is not externally adjusted and is meant for measuring time periods, for example, performance metrics. It is unrelated to system time (it can be the time since the last system reboot).
The supported arithmetic operations follow the expected semantics. Time literals represent durations. Durations can be added together or multiplied with scalars. Adding a duration to a time point produces a new time point with the desired offset. As we have seen in the example, a difference of two time points is a duration. Negative durations are supported as well.
The std::chrono library also supports date manipulation in a very similar fashion. However, we need to distinguish time points and dates. When manipulating dates, we have to first convert the date into one of the supported time points. This is why we convert our dates into sys_days (UTC) in the following example.
Aside from specific dates, we often refer to pseudo-dates. This ranges from simple constructs, like the last day in a month, to more complex ones, like the 4th Thursday in November (US Thanksgiving). The std::chrono library can also express these constructs; however, to produce an actual date, they must be converted to a year_month_day.
Finally, we also can work with timezones. As a reminder, the system_time is UTC-based, local_time is effectively unzoned time, and once we combine local_time with a timezone, we get zoned_time.
Filesystem
The std::filesystem library offers filesystem exploration, manipulation and querying tools.
Files and directories are identified by their paths, which are, by default, relative. The std::filesystem::equivalent comparator can be used to check whether two paths refer to the same filesystem entity.
Directory content can be enumerated using directory_iterator or recursive_directory_iterator (which will additionally explore sub-directories).
The std::filesystem library also exposes all typical file manipulation operations.
Revisiting I/O
In the previous example, we used a new type of stream, std::ofstream. Similar to std::cin and std::cout, files are also represented by streams.
There is only one standard input and one standard output (there is also standard error output in the form of std::cerr). In contrast, the streams representing files need to be instantiated as needed, std::ifstream for files we want to read from and std::ofstream for files we want to write to or create.
When we overload the stream insertion and extraction operators for a custom type, it will also work for file streams (how exactly is a topic for a future lesson).
Structured binding
Before discussing associative containers, it’s worth mentioning another language feature: structured binding.
Maps store their elements as std::pair of the key and value. Unfortunately, since the two members of std::pair are named first and second, this often leads to unreadable code (especially in nested situations).
Structured bindings allow the deconstruction of tuple-like types (std::pair, std::tuple, aggregates) into more appropriately named variables. Importantly the named variables are always indirect (references); however, how we capture the outer variable still matters.
Associative containers
We already talked about std::vector, std::array and std::string. These array-style containers store their elements in a contiguous memory block. Fortunately, we are not limited to arrays.
The std::unordered_set and std::unordered_map represent associative unordered containers (a.k.a. hash-maps) that offer amortized O(1) lookup, erase and insertion.
The std::unordered_map is a robust data structure; however, it does impose restrictions on the key. First, the type of key must be hashable.
The quality of hashes is an advanced topic and very much beyond the scope of this course. Therefore, we will rely on a boost library. Boost is a commonly used set of libraries offering more niche functionality than the standard library. Also, it serves as an incubator for features that can later make it into the standard library.
I have set up the homework environment so you can access the boost libraries as long you have the corresponding dependency in the BUILD file. For example, for boost::hash_combine, the dependency is @boost//:container_hash.
The second limitation of std::unordered_map relates to the structure. Since the location of the key-value pair in the internal storage of std::unoredered_map depends on the value of the key, we cannot mutate the key in place.
Finally, because the order of elements in a std::unordered_map is based on the storage and hash function implementation details, we should never rely on the actual order of elements.
std::unordered_set
The standard library also offers a std::unordered_set. This container behaves like a std::unordered_map with a similar API; however, it only stores keys and no mapped values. This is handy when we need an O(1) lookup, for example, to check whether a given key was already processed.
std::map, std::set
The unordered prefix might point you to the existence of ordered associative containers. These offer identical APIs to the unordered containers; however, instead of being hash-based, they store their keys sorted and stored in a tree structure. This leads to O(log(n)) complexity for the operations where unordered containers offer O(1) complexity.
Ordered containers should be avoided unless you explicitly require the order of elements to be always sorted.
Specialized containers
Two more specialized containers are std::deque - a double-ended queue and std::list - a doubly-linked list.
The double-ended queue is convenient for scenarios when we require an unbounded queue.
The doubly-linked list provides a convenient element splicing and re-ordering in O(1).
Container adapters
The std::stack, std::queue and std::priority_queue are container adapters. Container adapters wrap an existing container with a specialized interface.
The main benefit of container adapters is that they communicate the intention and strictly maintain the invariants of the respective structure.
The std::queue and std::stack model the respective FIFO and LIFO data structures.
The std::priority_queue models a priority queue with O(log(n)) operation complexity and can offer better performance than std::set for “always sorted” scenarios that do not require iteration over elements.
Plenty more in the standard library
As I mentioned in the beginning, the standard library is vast. In the next lesson, we will discuss another big chunk of the standard library, the algorithms.
While we will cover some additional topics in later lessons, many of the topics will remain for your study:
memory management
error handling
regular expressions
concurrency
metaprogramming
localization
utilities
Take your time and explore the available facilities on cppreference.com.
Homework
Because the standard library opens up the possibilities, this lesson will introduce a jump in homework complexity. You will need to take the time and research which tools to use from the standard library; often, the exact functions or types might not have been mentioned during the lesson (but will always be adjacent to what we talked about).
The template repository with homework for this lesson is here: https://github.com/HappyCerberus/daily-bite-course-05.
As with all homework, you will need VSCode and Docker installed on your machine and follow the instructions from the first lesson.
The goal is to make all tests pass as described in the readme file.