5.31. Iterators and generators
The for loop has been doing more work than it looks like.
This page covers the iterator protocol it runs on, and the
yield keyword that lets you build your own iterators.
5.31.1. The iterator protocol
Looping relies on two methods, split between two kinds of object:
__iter__() – on the iterable, return an iterator over the object’s items.
__next__() – on the iterator, return the next item or raise StopIteration when there are no more.
The iter() built-in calls __iter__; next() calls
__next__. Step through a list by hand:
it = iter([10, 20, 30])
print(next(it)) # 10
print(next(it)) # 20
print(next(it)) # 30
print(next(it)) # raises StopIteration
for is sugar for “call __iter__ once, then loop on
__next__ until StopIteration.”
What for x in items: actually does:
_it = iter(items)
while True:
    try:
        x = next(_it)
    except StopIteration:
        break
    # ... loop body ...
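Every list, tuple, string, dict, set, file object, and generator
implements __iter__, and the iterator it returns implements
__next__ – which is why they all work with for. File objects and
generators are their own iterators; the containers hand out a
separate iterator object on each call to iter().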
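To make the two roles concrete, here is a minimal sketch of a hand-written iterable and its iterator. The Countdown and CountdownIterator classes are hypothetical names invented for this illustration, not part of the standard library.

```python
class Countdown:
    """Hypothetical iterable: counts from start down to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand out a fresh iterator on every call, so the object
        # can be looped over more than once.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Hypothetical iterator: remembers the position of one pass."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # Iterators return themselves, so they also work in a for loop.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)
print(first_pass)   # [3, 2, 1]
print(second_pass)  # [3, 2, 1]
```

Keeping the position on a separate iterator object is one possible design; it is what lets the same Countdown be looped over twice here.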
5.31.2. yield and generator functions
A function that contains a yield statement is a generator
function. Calling it does not run the body; it returns a
generator object (an iterator) that runs the body one
yield at a time:
def count_up_to(n):
    i = 0
    while i < n:
        yield i
        i += 1

for value in count_up_to(3):
    print(value)
Output:
0
1
2
Each call to next() resumes the function until the next
yield, hands that value to the caller, and pauses there. The
local state (i in this case) is preserved between resumes.
next() runs the body up to the next yield, hands the
value back, and pauses. Local state survives the pause.
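The pause-and-resume behaviour is easier to see when the generator is driven by hand. The sketch below reuses count_up_to with print calls added purely for illustration, so each resume is visible.

```python
def count_up_to(n):
    i = 0
    while i < n:
        print(f"about to yield {i}")
        yield i
        i += 1
    print("body finished")


gen = count_up_to(2)   # nothing is printed yet: the body has not started
first = next(gen)      # prints "about to yield 0"
second = next(gen)     # prints "about to yield 1"
try:
    next(gen)          # prints "body finished", then raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)  # 0 1 True
```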
Generators are the easiest way to produce a sequence lazily – no list is built, items are computed only when the consumer asks for them, and the function can yield items forever if it wants.
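As a small sketch of that laziness, an endless generator can be consumed safely as long as the caller takes a bounded slice. The naturals function is a hypothetical helper written for this example; itertools.islice is the standard-library tool for taking the first few items of any iterator.

```python
from itertools import islice


def naturals():
    """Hypothetical endless generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1


# Only five values are ever computed, even though naturals() never ends.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```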
5.31.3. Lazy pipelines
Generators compose well. One generator’s output can feed another:
def numbers():
    i = 0
    while True:
        yield i
        i += 1

def squares(source):
    for x in source:
        yield x * x

pipeline = squares(numbers())
for v in pipeline:
    if v > 100:
        break
    print(v)
The values flow through the pipeline one at a time – no
intermediate list, no upper bound built into numbers, and
the consumer (for v in pipeline) decides when to stop.
Each next() on the consumer triggers one pull through
the chain; values exist only when something asks for them.
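Further stages slot in the same way. The sketch below adds a hypothetical only_even filter stage (not part of the original example) and uses itertools.takewhile to express the stopping condition instead of an explicit break.

```python
from itertools import takewhile


def numbers():
    i = 0
    while True:
        yield i
        i += 1


def squares(source):
    for x in source:
        yield x * x


def only_even(source):
    """Hypothetical filter stage: pass through even values only."""
    for x in source:
        if x % 2 == 0:
            yield x


pipeline = only_even(squares(numbers()))

# takewhile stops pulling as soon as a value exceeds 100.
small = list(takewhile(lambda v: v <= 100, pipeline))
print(small)  # [0, 4, 16, 36, 64, 100]
```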
5.31.3.1. When yield runs out
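Falling off the end of a generator function (or hitting an
explicit return) ends the generator: the pending next() call
raises StopIteration automatically, and the surrounding for
loop sees it and ends. There is no need to raise it by hand;
since Python 3.7, a StopIteration raised inside a generator
body is converted to RuntimeError.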
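A short sketch of both behaviours follows, assuming a hypothetical first_two generator that stops early with an explicit return. It also shows that an exhausted generator stays exhausted.

```python
def first_two(items):
    """Hypothetical generator that stops after two items."""
    count = 0
    for item in items:
        if count == 2:
            return  # ends the generator; the caller sees StopIteration
        yield item
        count += 1


gen = first_two("abcd")
collected = list(gen)
print(collected)  # ['a', 'b']

# Once finished, every further next() raises StopIteration again.
try:
    next(gen)
    still_exhausted = False
except StopIteration:
    still_exhausted = True
print(still_exhausted)  # True

# next() accepts a default that is returned instead of raising.
fallback = next(gen, None)
print(fallback)  # None
```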
Use generators when the producing code is naturally written as a loop with a few yield points; use a plain list comprehension when you genuinely need the whole sequence in memory.
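As a rough illustration of that trade-off, the sketch below uses a hypothetical squares_up_to generator for a single streaming pass and a list comprehension where indexing and len() are needed.

```python
def squares_up_to(n):
    """Hypothetical generator: squares of 0 .. n-1, one at a time."""
    for x in range(n):
        yield x * x


# One streaming pass: no list of squares is ever built.
lazy_total = sum(squares_up_to(1000))

# Random access and len() need the whole sequence in memory.
squares_list = [x * x for x in range(1000)]
middle = squares_list[500]
size = len(squares_list)

print(lazy_total == sum(squares_list))  # True
print(middle, size)  # 250000 1000
```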