What happens when you try to call .find()
/.contains()
/etc on an associative container (std::set
/std::map
/their unordered versions) using a key of a different type? 🤔
1
2
3
std::map<std::string, int> m;
// ...
if (m.contains("foobar")) { ... }
The std::map has the default comparator as the third template parameter - std::less<Key>
. The comparator provides an operator which is called while searching in the container.
The comparator is a struct with bool operator()
. It looks like this:
1
2
3
constexpr bool operator()(const Key &lhs, const Key &rhs) const {
return lhs < rhs;
}
That is, the argument’s type must be the same as the key type. Looking back at the beginning: the "foobar"
arguments has type const char[7]
. It can decay into the const char*
type or casted to std::string_view
or casted to std::string
. Therefore, unfortunately, after overload resolution it is needed to create a temporary object of type std::string
. 💩
We can use a string type with a custom allocator to check whether the temporary objects are created (look at std::string definition). The custom allocator will log allocation requests.
1
using traceable_string = std::basic_string<char, std::char_traits<char>, trace_allocator<char>>;
The strings should be long enough to get around Small String Optimization and make an actual memory allocation. In my environment it should be at least 16 chars long: godbolt link.
So, the problem is that we don’t want to create std::string
temporary objects, because different string types are comparable to each other. The problem was solved in C++14 via an angry hack 🪓: they made the default template parameter std::less<> = std::less<void>
and specialized the std::less<void>
template:
1
2
3
4
template <class Key1, class Key2>
constexpr auto operator()(Key1&& lhs, Key2&& rhs) const
return static_cast<Key1&&>(lhs) < static_cast<Key2&&>(rhs);
}
Now everyone has to remember about std::less<>
and declare containers like this:
1
std::map<std::string, int, std::less<>> m;
then there won’t be temporary objects with memory allocations: godbolt link. This hack is what implied by fancy words heterogeneous lookup.
I couldn’t find the reason why containers don’t use std::less<>
by default. I even checked the STL-like libraries with container realizations and saw two appoaches:
- Homogeneous lookup by default (like in STL) - example boost::container::map.
- Heterogeneous lookup by default - example hash table in Abseil (from Google), it’s an std::unordered_map/set replacement.
The performance is better with heterogeneous lookup, there are benchmarks on unordered containters - it gives speedup between 20% to 35%.