Exploring Generators in Python
In the last blog I talked about iterators and iterables, and I am assuming you're familiar with both concepts. Moving forward from there, let's talk about generators.
Simply put, generators are iterators produced by functions that use the yield keyword: instead of returning a value, they yield values. A generator function, in turn, is any function that has a yield keyword in its body.
Let's look at some code and find out a bit more about them so we can define them more formally.
def range_123():
    print("Start")
    yield 1
    yield 2
    yield 3
    print('End')

for number in range_123():
    print(number)
"""
OUTPUT:
Start
1
2
3
End
"""
numbers = range_123()  # Assigning the generator object to numbers
next(numbers)  # prints Start, then returns 1
next(numbers)  # returns 2
next(numbers)  # returns 3
next(numbers)  # prints End, then raises StopIteration
Looking closely at the above code, range_123() is a generator function. Calling it returns a generator object, and since generators are iterators we can either iterate over that object directly with a for loop, or assign it to a variable and call the built-in next() function on it until it is exhausted and raises StopIteration, in conformance with the iterator protocol.
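To see the iterator protocol at work, here is a minimal sketch that drives the generator by hand. It uses a trimmed copy of range_123() without the print calls, and the names is_own_iterator and collected are only illustrative.

```python
def range_123():
    # Trimmed copy of the generator function above, without the prints.
    yield 1
    yield 2
    yield 3

numbers = range_123()

# A generator object is its own iterator: iter() hands back the same object.
is_own_iterator = iter(numbers) is numbers

collected = []
while True:
    try:
        collected.append(next(numbers))
    except StopIteration:
        # Raised once the body of the generator function has finished.
        break

print(is_own_iterator)  # True
print(collected)        # [1, 2, 3]
```

This is roughly what a for loop does for us behind the scenes: it keeps calling next() and stops quietly when StopIteration is raised.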
Now you must be wondering: what is the difference between yield and return?
When a return statement is invoked inside a function, it permanently passes control back to the caller of the function and disposes of a function's local state.
When a yield is invoked, it also passes the control back to the caller of the function but it only does so temporarily. It suspends the function and retains its local state.
def greeter(name):
    while True:
        yield f'Hello {name}'

gen_object = greeter('Pradhvan')
next(gen_object)  # Output -> Hello Pradhvan
next(gen_object)  # Output -> Hello Pradhvan
next(gen_object)  # Output -> Hello Pradhvan
next(gen_object)  # Output -> Hello Pradhvan
If we look at the above code, we can clearly see that the local variables are stashed away temporarily: each yield suspends the function and gives control back to the caller while retaining its local state.
Since the generator is evaluated lazily, it can be resumed at any time by calling next() on it, so it can produce an effectively infinite series of greeting messages, one at a time.
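The retained local state is easier to spot when something actually changes between calls. The sketch below is a variation on greeter() with a counter added purely for illustration (greetings_sent is not part of the original example), and it uses itertools.islice to take a finite slice from the infinite generator.

```python
from itertools import islice

def greeter(name):
    greetings_sent = 0  # local state that survives between next() calls
    while True:
        greetings_sent += 1
        yield f'Hello {name} ({greetings_sent})'

gen_object = greeter('Pradhvan')

# islice pulls only three items, so the infinite loop never runs away.
first_three = list(islice(gen_object, 3))
print(first_three)
# ['Hello Pradhvan (1)', 'Hello Pradhvan (2)', 'Hello Pradhvan (3)']

# The generator picks up exactly where it was suspended.
fourth = next(gen_object)
print(fourth)  # Hello Pradhvan (4)
```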
Let's look at one more example of a code snippet where multiple yield statements decide the flow of the function.
def repeater():
    while True:
        print("Start")
        yield 1
        yield 2
        print("end")

gen_obj = repeater()
next(gen_obj)  # 1
next(gen_obj)  # 2
next(gen_obj)  # 3
"""
OUTPUT # 1
Start
1
OUTPUT # 2
2
OUTPUT # 3
end
Start
1
"""
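The above example makes it clear that in a generator function, the point where the function suspends is decided by the yield statement. Call #2 resumes right after yield 1 and suspends again at yield 2, so it prints nothing extra. Call #3 resumes after yield 2, prints end, loops back to the top, prints Start and suspends at yield 1 again, which is why we get the whole block of statements.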
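One way to check this flow without reading printed output is to record the side effects in a list. This is only a sketch: the events parameter and the after_* snapshots are illustrative additions to repeater(), not part of the original function.

```python
def repeater(events):
    while True:
        events.append("Start")
        yield 1
        yield 2
        events.append("end")

events = []
gen_obj = repeater(events)

first = next(gen_obj)
after_first = list(events)    # ['Start']

second = next(gen_obj)
after_second = list(events)   # still ['Start']: nothing runs between the two yields

third = next(gen_obj)
after_third = list(events)    # ['Start', 'end', 'Start']

print(first, second, third)   # 1 2 1
```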
Generator Expression
A generator function can often be replaced with a generator expression. Generator expressions look similar to list comprehensions, but whereas a list comprehension eagerly builds a list, a generator expression returns a generator that lazily produces the items.
def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res1 = [x*3 for x in range_123()]
"""
Output res1:
Start
Middle
End
"""
for i in res1:
    print("-->", i)
"""output:
--> 3
--> 9
"""
- The list comprehension eagerly iterates over the items that are to be yielded, so building res1 prints Start, Middle and End.
- When the for loop iterates over the list res1, it only prints the stored items, 3 and 9.
def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res2 = (x*3 for x in range_123())
print(res2)  # <generator object <genexpr> at 0x7f8be1d09150>
for i in res2:
    print("-->", i)
"""
Output
Start
--> 3
Middle
--> 9
End
"""
- In the case of the generator expression, the body of the generator function range_123() only executes when the for loop iterates over the generator object res2.
- Each iteration calls next(), and the iteration advances until StopIteration is raised.
Comprehensions are a great way to increase the readability of your code, and if you use a generator expression instead of a list comprehension, you also make it more memory efficient, because items are produced one at a time rather than stored in a list.
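As a rough check of the memory point, the sketch below compares the size of a list comprehension with that of the equivalent generator expression using sys.getsizeof. The value of n and the variable names are arbitrary choices for illustration, exact byte counts vary between Python versions, and getsizeof only measures the container itself, not the integers it refers to.

```python
import sys

n = 100_000

squares_list = [x * x for x in range(n)]  # builds all n items up front
squares_gen = (x * x for x in range(n))   # builds nothing until iterated

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # True

# A generator expression can be passed straight to a consuming function
# such as sum(), so the full list never has to exist in memory.
total = sum(x * x for x in range(n))
print(total == sum(squares_list))  # True
```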
That said, the comprehension feature is easy to overuse, and that can backfire. I found a great article, Overusing list comprehensions and generator expressions in Python, which you should definitely look into.