Let’s iterate together

Implementing Python’s zip in C++

zip lets multiple sequences to be iterated over together

To keep things neat and clean, I will be omitting the std namespace qualifier everywhere except where I can’t.

What a nifty (not so) little feature! In a nutshell, iterators enable us to hop between elements in a container, thus, separating code dealing with what to do to those elements without giving much of a thought on how to traverse it.

It was an ecstatic moment when C++11 introduced range-based for-loop. If a container has an iterator interface, i.e. support for begin, end, etc., you could iterate over its elements using this new kind of for-loop. The syntax looks so expressive in its intent.

vector v{1, 2, 3, 4, 5};
for (auto &e : v)
cout << e << endl;

FYI, the above example uses auto and CTAD as well. Interestingly, this is exactly the way we write for-loops in Python.

my_list = [1, 2, 3, 4, 5]
for e in my_list:
print(e)

It’s not uncommon to see code in which two or more sequences (or Iterables in Python lingo) are iterated over together in a loop. One of the ways this can be done is by using a built-in zip function.

a = [1, 2, 3, 4]
b = 'hello'
for x, y in zip(a, b):
print(x, y)
# prints
# 1 h
# 2 e
# 3 l
# 4 l

This is so expressive, compact, and we can do that with any number of iterables. Could we do something similar in C++? Is there anything in C++ corresponding to x, y in zip(a, b) syntax in Python? Structured binding (C++17) comes to mind. We would like to write something like this:

for (auto [x, y, z]: zip(a, b, c)) { /* ... */ }

This code roughly gives us an idea of the requirements that zip needs to fulfill.

  1. The call to zip must return something which has an iterator interface. That is, it must have begin and end methods defined.
  2. The iterator obtained from the return object must have operator*()(de-reference), operator++() (pre-increment) and operator!=() (inequality) operator methods defined. These would be consumed by the (client’s) for-loop internally.
  3. zip should support any number of arguments of any type (container/container-like type of course).

Taking into account of the requirement mentioned previously, we can write a skeleton code for zip as below:

template <typename ...T>
struct zip
{
struct iterator { /* ... */ };
zip(const T &...args)
: data(forward_as_tuple(forward<T>(args)...)) {}
auto begin() { return iterator{ /* ... */ }; }
auto end() { return iterator{ /* ... */ }; }
private:
tuple<T &...> data;
};

To keep it simple, I’m using a tuple of references because I don’t want any ownership of its arguments. Now, we need to figure out the commented parts.

In the previous for-loop, the structured binding will work if zip::iterator‘s de-reference operator returns a tuple. This tuple needs to contain reference of elements of each container in zip::data. To mediate between these two tuples, we need another tuple containing iterators of each container. We can keep this tuple inside zip::iterator. In a sense, this struct acts as a proxy for the tuple of iterators.

But how do we find the iterator type of each container? Recall that when we call begin on a container we get its iterator. Using decltype and declval, we can get its type, viz. decltype(begin(declval<T&>())). In C++20, we can write ranges::iterator_t<T> instead.

Besides that, it’s good practice to have the usual nested types the standard library iterators have. This allows for better interoperability with the algorithms from the standard library. Using iterator traits, we can get the required types from an iterator type, viz. iterator_traits<ranges::iterator_t<T>>::value_type, iterator_traits<ranges::iterator_t<T>>::reference etc. With C++20, we can write the corresponding types as iter_value_t<ranges::iterator_t<T>, iter_reference_t<ranges::iterator_t<T>etc. respectively.

Since we have a tuple of Ts, the above nested types will be written as a tuple of these types in zip::iterator as shown below:

struct iterator
{
using iterator_category = input_iterator_tag;
using value_type = tuple<iter_value_t<ranges::iterator_t<T>>...>;
using reference = tuple<iter_reference_t<ranges::iterator_t<T>>...>;
using difference_type = tuple<iter_difference_t<ranges::iterator_t<T>>...>;
using pointer = std::tuple<typename iterator_traits<ranges::iterator_t<T>>::pointer...>;
tuple<ranges::iterator_t<T>...> data_;
};

In addition to that, this struct also needs to define the previously mentioned operator methods which we will get to shortly.

The begin and end methods of zip return an iterator. In begin, we need to pass a tuple of iterators to the zip::iterator. How do we get iterators of each container in zip::data? We call begin on each container. Duh!

Luckily, we don’t have to reinvent the wheel. We already have with us the apply function from C++17. It takes a callable and a tuple as arguments.

template <class F, class Tuple>
constexpr decltype(auto) apply(F&& f, Tuple&& t);

What this function essentially does is, it calls the callable f with the tuple elements of t as arguments. Inside the callable, we can call std::begin on each argument. Voila! We have transformed a tuple of containers to a tuple of iterators. Our begin method would look like this:

auto begin()
{
return iterator(apply([]<typename... _Tp>(_Tp && ...e) {
return make_tuple(std::begin(forward<_Tp>(e))...);
}, data));
}

This code uses explicit template parameter lists introduced in C++20. Alternatively, one can use the generic lambda form if C++20 is not available; in which case, replace _Tp && …e with auto && …e and forward<_Tp> with forward<decltype(e)>. We can write the end method similarly. Just replace the inner std::begin call with an std::end call.

Okay, now we are left with defining the operator methods of zip::iterator. Since, this struct is a proxy for underlying iterators we have kept inside a tuple zip::iterator::data_, we want to simply call the corresponding operators of those iterators. We can use the previous technique using apply function again:

struct iterator
{
/* nested type declarations */
reference operator*() {
return apply([]<typename... _Tp>(_Tp && ...e) {
return forward_as_tuple(*forward<_Tp>(e)...);
}, data_);
}
iterator &operator++() {
return apply([]<typename... _Tp>(_Tp && ...e) {
return make_tuple(++forward<_Tp>(e)...);
}, data_);
}
bool operator!=(const iterator &iter) const { /* ... */ }
tuple<ranges::iterator_t<T>...> data_;
};

The inequality operator is a little bit tricky though. The for-loop (of client code which uses zip) should end the iteration when any one of the underlying iterators become end-iterators. So, this method should compare its data_ with iter::data_ comparing element by element and returning false when any of the corresponding elements are equal.

Since this comparison involves two tuples, we cannot use apply directly. We need to write some helper functions. Also, we need a mechanism to unpack the tuples’ elements. We can use index_sequence for this mechanism.

namespace detail
{
template <typename... _Tp>
bool variadic_or(_Tp &&...args) { return (... || args); }
template <typename Tuple, std::size_t... I>
bool any_equals(Tuple &&t1, Tuple &&t2, index_sequence<I...>)
{
return variadic_or(get<I>(forward<Tuple>(t1)) == get<I>(forward<Tuple>(t2))...);
}
}

Now we can write the inequality operator method as follows:

bool operator!=(const iterator &iter)
{
return !detail::any_equals(data_, iter.data_, index_sequence_for<T...>{});
}

Our implementation is now complete. Using zip we can now write in C++, the code below:

vector v{1, 2, 3, 4};
string s = "hello";
double d[] = {1.2, 2.3, 3.4};
for (auto [x, y, z] : zip(v, s, d))
cout << x << ' ' << y << ' ' << z << endl;
/* prints
* 1 h 1.2
* 2 e 2.3
* 3 l 3.4
*/

Isn’t that neat! Here’s a link with the implementation: https://godbolt.org/z/neaTjY3P9.

Let me know what you think about this in the comments section. Suggestions are always welcome. Happy coding :)

--

--

Software Developer @VMware. Interested in music, photography, physics.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Debashish Ghosh

Software Developer @VMware. Interested in music, photography, physics.