Iterating through an array (or other structure) of data is quite a common thing to do in programming. And so far, we’ve covered many different ways to do so: with loops and an index (for-loops
and while loops
), with pointers and pointer arithmetic, and with range-based for-loops
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
#include <array> #include <iostream> int main() { // The type is automatically deduced to std::array<int, 7> (Requires C++17). // Use the type std::array<int, 7> if your compiler doesn't support C++17. std::array data{ 0, 1, 2, 3, 4, 5, 6 }; std::size_t length{ std::size(data) }; // while-loop with explicit index std::size_t index{ 0 }; while (index != length) { std::cout << data[index] << ' '; ++index; } std::cout << '\n'; // for-loop with explicit index for (index = 0; index < length; ++index) { std::cout << data[index] << ' '; } std::cout << '\n'; // for-loop with pointer for (auto ptr{ &data[0] }; ptr != (&data[0] + length); ++ptr) { std::cout << *ptr << ' '; } std::cout << '\n'; // ranged-based for loop for (int i : data) { std::cout << i << ' '; } std::cout << '\n'; return 0; } |
Looping using indexes is more typing than needed if we only use the index to access elements. It also only works if the container (e.g. the array) provides direct access to elements (which arrays do, but some other types of containers, such as lists, do not).
Looping with pointers and pointer arithmetic is verbose, and can be confusing to readers who don’t know the rules of pointer arithmetic. Pointer arithmetic also only works if elements are consecutive in memory (which is true for arrays, but not true for other types of containers, such as lists, trees, and maps).
For advanced readers
Pointers (without pointer arithmetic) can also be used to iterate through some non-sequential structures. In a linked list, each element is connected to the prior element by a pointer. We can iterate through the list by following the chain of pointers.
Range-based for-loops are a little more interesting, as the mechanism for iterating through our container is hidden -- and yet, they still work for all kinds of different structures (arrays, lists, trees, maps, etc…). How do these work? They use iterators.
Iterators
An iterator is an object designed to iterate through a container without having to care about the internal structure of the data.
The simplest kind of iterator is a pointer, which (using pointer arithmetic) works for consecutive data. Let’s revisit a simple array traversal using a pointer and pointer arithmetic:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
#include <array> #include <iostream> int main() { std::array data{ 0, 1, 2, 3, 4, 5, 6 }; auto begin{ &data[0] }; // note that this points to one spot beyond the last element auto end{ &data[0] + std::size(data) }; // for-loop with pointer for (auto ptr{ begin }; ptr != end; ++ptr) // ++ to move to next element { std::cout << *ptr << ' '; // dereference to get value of current element } std::cout << '\n'; return 0; } |
Output:
0 1 2 3 4 5 6
In the above, we defined two variables: begin
(which points to the beginning of our container), and end
(which marks an end point). For arrays, the end marker is typically the place in memory where the last element would be if the container contained one more element.
The pointer then iterates between begin
and end
, and the current element can be accessed by dereferencing the pointer.
Warning
You might be tempted to calculate the end marker using the address-of operator and array syntax like so:
1 |
int* end{ &array[std::size(array)] }; |
But this causes undefined behavior, because array[std::size(array)]
accesses an element that is off the end of the array.
Instead, use:
1 |
int* end{ &array[0] + std::size(array) }; |
Standard library iterators
Iterating is such a common operation that all standard library containers offer direct support for iteration. Instead of calculating our own begin and end points, we can simply ask the container for the begin and end points via functions conveniently named begin()
and end()
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
#include <array> #include <iostream> int main() { std::array array{ 1, 2, 3 }; // Ask our array for the begin and end points (via the begin and end member functions). auto begin{ array.begin() }; auto end{ array.end() }; for (auto p{ begin }; p != end; ++p) // ++ to move to next element. { std::cout << *p << ' '; // dereference to get value of current element. } std::cout << '\n'; return 0; } |
This prints:
1 2 3
The iterator
header also contains two generic functions (std::begin
and std::end
) that can be used:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
#include <array> #include <iostream> #include <iterator> // For std::begin and std::end int main() { std::array array{ 1, 2, 3 }; // Use std::begin and std::end to get the begin and end points. auto begin{ std::begin(array) }; auto end{ std::end(array) }; for (auto p{ begin }; p != end; ++p) // ++ to move to next element { std::cout << *p << ' '; // dereference to get value of current element } std::cout << '\n'; return 0; } |
This also prints:
1 2 3
Don’t worry about the types of the iterators for now, we’ll re-visit iterators in a later chapter. The important thing is that the iterator takes care of the details of iterating through the container. All we need are four things: the begin point, the end point, operator++ to move the iterator to the next element (or the end), and operator* to get the value of the current element.
Back to range-based for loops
All types that have begin
and end
member functions or can be used with std::begin
and std::end
are usable in range-based for-loops.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
#include <array> #include <iostream> int main() { std::array array{ 1, 2, 3 }; // This does exactly the same as the loop we used before. for (int i : array) { std::cout << i << ' '; } std::cout << '\n'; return 0; } |
Behind the scenes, the range-based for-loop calls begin()
and end()
of the type to iterate over. std::array
has begin
and end
member functions, so we can use it in a range-based loop. C-style fixed arrays can be used with std::begin
and std::end
functions, so we can loop through them with a range-based loop as well. Dynamic arrays don’t work though, because there is no std::end
function for them (because the type information doesn’t contain the array’s length).
You’ll learn how to add functions to your types later, so that they can be used with range-based for-loops too.
Range-based for-loops aren’t the only thing that makes use of iterators. They’re also used in std::sort
and other algorithms. Now that you know what they are, you’ll notice they’re used quite a bit in the standard library.
Leave a Reply