pradhvan

Python geek who loves to play around with web technologies.

This is the second part of the series. In the first part we talked about the general idea of concurrency, how it differs from parallelism, and how Python handles concurrency.

Part 1: Talking Concurrency -1

In this second part, we will look into the modern solution to the problem, using the asyncio module.

Import Asyncio

In the last post, we looked at a basic code snippet showing how to write concurrent code. We also discussed some of the basic terminology used with the asyncio module. If you don't remember it, take a quick recap, as we will now look at those concepts in a bit more detail.

Before looking at some code, let's understand some basic terminologies that would help in understanding the code better.

  • Event loop: it's an infinite loop that keeps track of all the running tasks. It manages all the suspended functions and executes them when the time is right. These functions are stored in a queue called the task queue; the event loop constantly polls the task queue and runs whichever task is ready. When a task is handed to the event loop, you get a future object back.

  • Future: a future is an indirect reference to a forthcoming result. It can loosely be translated as a promise to do something when a condition is met; once the condition is met, the future can “call back” whoever is waiting on it. Since everything is an object in Python, a future is also an object; it implements the __await__() method and its job is to hold a certain state and result. The state can be one of three things:

      • Pending: it does not have a result or exception yet.
      • Cancelled: it was cancelled.
      • Finished: it finished, either with a result or with an exception.

Futures also have a method called add_done_callback(). It registers a function to be called as soon as the future is done, that is, once it holds either the expected result as a Python object or the exception that was raised.

  • Tasks: a task executes a coroutine in an event loop. In a program, asyncio.create_task(coroutine) wraps the coroutine into a task, schedules its execution and returns a task object. Every time the coroutine awaits a future, the future is sent back to the task, and the task binds itself to the future by calling add_done_callback() on it. From then on, when the state of the future changes to cancelled or finished (whether by raising an exception or by delivering a result as a Python object), the task is called back and resumes where it left off.

Since a typical program has multiple tasks to execute concurrently, we normally create them with asyncio.create_task(coroutine) and wait for all of them with asyncio.gather().

  • Coroutine: asyncio was introduced in Python 3.4. It started off with decorator-based coroutines, @asyncio.coroutine, which used the yield from syntax. Later, Python 3.5 introduced the async and await keywords, which made writing and reading concurrent code much easier. I won't go into detail on how coroutines evolved into the new async def syntax, because I am planning to write a separate blog on that.

As we looked into the basic definition of coroutines in the last blog, we can loosely describe them as restartable functions.

You make a coroutine with async def and you can suspend it with the await keyword. Every time you await, the function is suspended while whatever you asked to wait on happens; when that has finished, the event loop wakes the function up again and resumes it from the await call, passing any result out. Since coroutines evolved from generators, and generators are iterators with an __iter__() method, coroutines similarly have an __await__() method, which allows them to continue every time await is called.

At each step a coroutine does one of three things:

  • It awaits a future.
  • It awaits another coroutine.
  • It returns a result.

Before moving forward, I want to talk about await. In Python, anything that can be awaited, i.e. used with the await keyword, is called an awaitable object. The most common awaitables you will use are coroutines, futures and tasks. Thus anything that would block gets handed to the event loop using await, and the waiting coroutine is added to the list of paused coroutines.
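To make the terms above a little more concrete, here is a minimal sketch that awaits each of the three common awaitables in turn and attaches a callback to a future. The names answer, on_done and main are made up for this illustration and are not part of asyncio.

```python
import asyncio

results = []


async def answer():
    # A coroutine: awaiting it runs it to completion.
    await asyncio.sleep(0.01)
    return 42


def on_done(fut):
    # Called by the event loop once the future holds a result.
    results.append(fut.result())


async def main():
    loop = asyncio.get_running_loop()

    # 1. Awaiting a coroutine directly.
    from_coroutine = await answer()

    # 2. Awaiting a task (a coroutine scheduled on the event loop).
    task = asyncio.create_task(answer())
    from_task = await task

    # 3. Awaiting a bare future that something else completes later.
    future = loop.create_future()
    future.add_done_callback(on_done)
    loop.call_later(0.01, future.set_result, "done")
    from_future = await future

    return from_coroutine, from_task, from_future


values = asyncio.run(main())
print(values)
print(results)
```

In everyday code you rarely create bare futures yourself; tasks, which are a kind of future, are usually all you need.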

Now let's look at a very basic async program to understand how everything fits in together.

import asyncio

async def compute(x, y):
    print("Compute %s + %s ..." % (x, y))
    await asyncio.sleep(1.0)
    return x + y

async def print_sum(x, y):
    result = await compute(x, y)
    print("%s + %s = %s" % (x, y, result))

asyncio.run(print_sum(1, 2))

The sequence diagram below describes the flow of the above program.

tulip_coro.png

Now that we know all the basic terminology used in an async program, let's look at a slightly more complex program to get a better understanding of all the jargon we learned above.

import asyncio


async def compute(x, y):
    """
    A coroutine that takes in two values and returns the sum.
    """
    print(f"Computing the value of {x} and {y}")
    await asyncio.sleep(1)
    return x + y


async def print_sum():
    """
    A coroutine that creates tasks.
    """
    value1 = asyncio.create_task(compute(1, 0))
    value2 = asyncio.create_task(compute(1, 0))
    value3 = asyncio.create_task(compute(1, 0))
    print(sum(await asyncio.gather(value1, value2, value3)))

asyncio.run(print_sum())

async def print_sum() and async def compute() are the two coroutines in the above program. print_sum() plays the role that the main function plays in sync programming: the main function drives the entire program and all the functions related to it. The same approach is followed here, with one coroutine awaiting all the other coroutines.

This can easily be misunderstood, though. Written as below, the program would run just fine, but in a more or less sequential manner.

    value1 = await asyncio.create_task(compute(1, 0))
    value2 = await asyncio.create_task(compute(1, 0))
    value3 = await asyncio.create_task(compute(1, 0))
    print(sum([value1, value2, value3]))

The above code is a good example of how not to write async code: by awaiting each task right after creating it, we make every call wait for the previous one, so the program runs sequentially. To avoid this, the program uses asyncio.gather() to gather all the tasks, value1, value2 and value3.

Finally, when all the tasks are gathered together, they are run concurrently.
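The difference is easy to measure. The following is a small sketch, not a benchmark: it times both styles with a shortened sleep of 0.1 seconds. The helper names sequential, concurrent and timed are made up for this illustration.

```python
import asyncio
import time


async def compute(x, y):
    await asyncio.sleep(0.1)
    return x + y


async def sequential():
    # Each task is awaited before the next one is even created.
    value1 = await asyncio.create_task(compute(1, 0))
    value2 = await asyncio.create_task(compute(1, 0))
    value3 = await asyncio.create_task(compute(1, 0))
    return [value1, value2, value3]


async def concurrent():
    # All three tasks are created first, then awaited together.
    tasks = [asyncio.create_task(compute(1, 0)) for _ in range(3)]
    return await asyncio.gather(*tasks)


def timed(coro_func):
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


seq_result, seq_time = timed(sequential)
con_result, con_time = timed(concurrent)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The sequential version should take roughly three sleeps, the concurrent one roughly a single sleep.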

Sync-Async-Sync

A lot of the time you will have to call a sync function (def) from a coroutine (async def), or call a coroutine (async def) from a sync function (def). Ideally, you “shouldn't” use sync functions for calls that can be async, like a database call, because that is where further optimization is possible. But there is nothing wrong with using a synchronous library for the database and an async library for HTTP, and gradually moving things to async.

  • Sync-Async

Calling a sync function (def) from a coroutine (async def): in that case, you run the sync function in a different thread using a thread pool executor. The run_in_executor() method of the event loop takes an executor instance, a regular callable to invoke, and any arguments to be passed to the callable. It returns a Future that can be awaited to wait for the function to finish its work and return something.

import asyncio
import concurrent.futures

def blocking_io():
    # File operations (such as logging) can block the
    # event loop: run them in a thread pool.
    with open('/dev/urandom', 'rb') as f:
        return f.read(100)

def cpu_bound():
    # CPU-bound operations will block the event loop:
    # in general it is preferable to run them in a
    # process pool.
    return sum(i * i for i in range(10 ** 7))

async def main():
    loop = asyncio.get_running_loop()

    ## Options:

    # 1. Run in the default loop's executor:
    result = await loop.run_in_executor(
        None, blocking_io)
    print('default thread pool', result)

    # 2. Run in a custom thread pool:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, blocking_io)
        print('custom thread pool', result)

    # 3. Run in a custom process pool:
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, cpu_bound)
        print('custom process pool', result)

asyncio.run(main())
  • Async-Sync

When you have to call coroutines from a normal sync function, you get the loop manually with get_event_loop(), create the tasks with create_task() and pass them to asyncio.gather(). Since you can't await inside a sync function, one thing you can do is create a queue with asyncio.Queue() and use that queue to pass data between the different coroutines.


import asyncio


async def compute(x, y, data):
    print(f"Computing the value of {x} and {y}")
    result = x + y
    await data.put(result)


async def process(n, data):
    processed, sumx = 0, 0
    while processed < n:
        item = await data.get()
        print(item)
        processed += 1
        value = item
        sumx += value
    print(f"The sum is:{sumx}")
    await asyncio.sleep(.5)


def main():
    loop = asyncio.get_event_loop()
    data = asyncio.Queue()
    sum1 = loop.create_task(compute(1, 4, data))
    sum2 = loop.create_task(compute(0, 0, data))
    sum3 = loop.create_task(process(2, data))
    final_task = asyncio.gather(sum1, sum2, sum3)
    loop.run_until_complete(final_task)


if __name__ == '__main__':
    main()

What now?

  • Just to get a better understanding of all the new syntax you learned, you can try out the sample problem below.

Write a program that reads log files and refires the URLs that have a 5xx status code. Once the refiring is done, append &retry=True to the URL and store it in a separate log file.

The log file will be a text file, you can check out a sample file here.

  • As I am still exploring the concept of concurrency, I don't yet know all the best practices and pitfalls of writing async code, but I highly recommend you check out asyncio: We Did It Wrong – roguelynn. That article is a good follow-up once you are done with this one and are comfortable with the syntax of asyncio.

Just before ending the blog I would like to thank maxking and Jason Braganza for helping me out in the blog.

In the next part of the series, I will be talking about threads and finally will conclude the series with asyncio based frameworks such as quart and aiohttp.

Happy Coding!

Whenever we think of programs or algorithms we think of steps that are supposed to be done one after the other to achieve a particular goal. Let's take a very simple example of a function that is supposed to greet a person:

def greeter(name):
    """Greeting function"""
    print(f"Hello {name}")

greeter("Guido") #1
greeter("Luciano") #2
greeter("Kushal") #3
"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Here the function greeter() greets the person whose name is passed to it. But it does so sequentially, i.e. while the first greeter() call runs, the rest of the program is blocked until that call finishes. Only then are the second and third function calls made.

This familiar style of programming is called sequential programming.

Why concurrency?

Sequential programming is comparatively easy to understand and most of the time fits the use case. But sometimes you need to get the most out of your system for some reason X; the most common value of X I could find is scaling your application.

Though greeter() is just a toy example, a real-world application with real users needs to work just as well under the huge amount of traffic it receives. Every time you get a spike in traffic or daily active users you can't just add more hardware, so one of the best solutions at times is to utilize your current system to the fullest. This is where concurrency comes into the picture.

Concurrency is about dealing with lots of things at once. – Rob Pike

Challenges in writing concurrent programs

Before I move forward, I know what most people will say. If it's that important, why aren't people at work/college/park/metro station/.. talking about it? Why do most people still use sequential programming patterns while coding?

For a very simple reason: it's not easy to wrap your head around, and it's very easy to write sequential code pretending to be concurrent code.

concurrency-comic

I got to know about this programming style very late, and when I later talked to people they said the same thing. It's not easy to code, it's easy to skip the best practices, and it's very hard to debug, so most people try to stick to the normal style of programming.

How Python handles concurrency?

The two most popular ways (techniques) of dealing with concurrency in Python are:

  1. Threading
  2. Asyncio

Threading: Python has a threading module that helps in writing multi-threaded code. You can spawn independent threads that share common state (such as a variable that is accessed by two independent threads).

Let's re-write that greeter() function again now with threads.

import threading 
import time
def main():
    thread1 = threading.Thread(target=greeter, args=('Guido',))
    thread2 = threading.Thread(target=greeter, args=('Luciano',))
    thread3 = threading.Thread(target=greeter, args=('Kushal',))
    thread1.start()
    thread2.start()
    thread3.start()

def greeter(name):
    print("Hello {}".format(name))
    time.sleep(1)
    
if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""
    

Here thread1, thread2 and thread3 are three independent threads that run alongside the main thread of the interpreter. This may look like it is running in parallel, but it's not. Whenever a thread waits (here it simply sleeps, but the wait can be anything: reading from a socket, writing to a socket, reading from a database), control is passed on to another thread in the queue. In threading, this switching is done by the operating system (preemptive multitasking).

Though threads seem to be a good way to write concurrent code, they do have some problems too.

  • The switch between the threads during the waiting period is done by the operating system. The user does not have control over it.
  • Python has a lock called the GIL (Global Interpreter Lock). Only the thread that holds the GIL can run; the others have to wait for their turn to get the GIL, and only then can they proceed. That is fine if you're doing an I/O-bound task but hurts if you're doing a CPU-bound task.
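To see why waiting-heavy work still benefits from threads despite the GIL, here is a small sketch of the greeter() example with a timer added. time.sleep() stands in for an I/O wait, and the shared greetings list and the timing code are additions made for this illustration.

```python
import threading
import time


def greeter(name, greetings):
    time.sleep(0.2)  # stands in for an I/O wait; the GIL is released here
    greetings.append(f"Hello {name}")


greetings = []  # state shared by all three threads
threads = [
    threading.Thread(target=greeter, args=(name, greetings))
    for name in ("Guido", "Luciano", "Kushal")
]

start = time.perf_counter()
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait until every thread has finished
elapsed = time.perf_counter() - start

print(sorted(greetings))
print(f"elapsed: {elapsed:.2f}s")
```

The three waits overlap, so the total should be close to one sleep rather than three. A CPU-bound function in place of time.sleep() would not see the same gain.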

Asyncio: Python introduced the asyncio package in 3.4, which follows a different approach to concurrency. It brought in the concept of coroutines. A coroutine is a restartable function that can be awaited (paused) and resumed later. Unlike threads, where the operating system decides when to switch, here the user decides where a coroutine gives up control. This is called cooperative multitasking.

Asyncio brought new keywords like async and await. A coroutine is defined with the async keyword and is awaited so that the waiting time can be utilized by the other coroutine.

Let's rewrite the greeter() again but now using the Asyncio.

import asyncio


async def greeter(name):
    await asyncio.sleep(1)
    print(f'Hello {name}')


def main():
    loop = asyncio.get_event_loop()

    task1 = loop.create_task(greeter('Guido'))
    task2 = loop.create_task(greeter('Luciano'))
    task3 = loop.create_task(greeter('Kushal'))

    final_task = asyncio.gather(task1, task2, task3)
    loop.run_until_complete(final_task)


if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Looking at the above code we see some not-so-common jargon thrown around: event loop, tasks, and a new sleep function. Let's understand them before we dissect the code and understand how it works.

  • Event loop: it's one of the most important parts of async code. It is a simple piece of code that keeps on looping and checks whether anything has finished waiting and needs to be executed. Only a single task can run in an event loop at a time.
  • Coroutines: here greeter() is a coroutine which prints the greeting. This is a simple example, but in an I/O-bound process a coroutine needs to wait, so await lets it wait and get off the event loop in the meantime. The asyncio.sleep() function is different from time.sleep() because asyncio.sleep() is a non-blocking call, i.e. it does not hold up the rest of the program while it waits. The argument given to asyncio.sleep() is the number of seconds to wait.
  • Tasks: calling a coroutine does not return the value of the coroutine; it returns a coroutine object. So separate tasks are created from the coroutine, and these can run independently.
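The point about tasks can be checked directly. In this small sketch greeter() returns its greeting instead of printing it, purely so the result can be inspected; that change and the variable names are assumptions made for the illustration.

```python
import asyncio


async def greeter(name):
    await asyncio.sleep(0.01)
    return f"Hello {name}"


async def main():
    # Calling a coroutine function runs nothing yet; it only builds an object.
    coro = greeter("Guido")
    kind = type(coro).__name__

    # Wrapping the object in a task schedules it on the running event loop.
    task = asyncio.create_task(coro)
    greeting = await task
    return kind, greeting


kind, greeting = asyncio.run(main())
print(kind)      # coroutine
print(greeting)  # Hello Guido
```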

Now let's move on to the code. Here task1, task2 and task3 work concurrently, each running the coroutine. Once all the tasks are gathered, the event loop runs until all of them are completed.

I hope this gives you a brief overview of concurrency. We will be diving deeper into both threading and asyncio, and into how we can use async for web applications using aiohttp and quart.

Stay tuned this will be a multi-part series.

While reading about concurrency you might come across a lot of other topics that are easy to confuse with concurrency, so let's look at them now to see how concurrency is different.

Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. Not the same, but related. One is about structure, one is about execution. Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable. -Rob Pike

  • Parallelism: doing tasks simultaneously. This is different from concurrency: in parallelism all the tasks run side by side without waiting (sleeping) for other tasks, unlike concurrent tasks. The method to achieve it is called multiprocessing. Multiprocessing is well suited for CPU-bound tasks as it distributes tasks over different cores of the CPU. Sadly, Python's GIL doesn't go well with CPU-bound tasks.

  • Single-Threaded/Multi-Threaded: because of Python's GIL only one thread runs Python code at a time, which makes Python effectively single-threaded, but you can still use multiple threads. These threads run along with the main thread. So threading, in general, is a method to achieve concurrency.

  • Asynchronous: asynchrony is used to present the idea of either concurrent or parallel tasks, and when we talk about asynchronous execution the tasks can correspond to different threads, processes or even servers.

Part 2: Talking Concurrency: Asyncio

In the last blog I talked about iterators and iterables, and I am assuming you're familiar with both concepts. So moving forward from there, let's talk about generators.

Simply put, generators are iterators built with the yield keyword: they do not return, they yield. Similarly, a generator function is one that has a yield keyword in its body.

Let's look at some code and find out a bit more about them so we can define them more formally.

def range_123():
    print("Start")
    yield 1
    yield 2
    yield 3
    print('End')


for number in range_123():
    print(number)
"""
OUTPUT:
Start
1
2
3
End
"""

numbers = range_123() # Assigning generator object to numbers

next(numbers) #Output -> 1
next(numbers) #Output -> 2
next(numbers) #Output -> 3
next(numbers) #Output -> StopIteration Error

Looking closely at the above code, range_123() is a generator function. Since generators are iterators, we can either iterate directly over the result of calling the generator function, or assign the generator object to a name and call next() on it until it's exhausted and raises StopIteration, conforming to the iterator protocol.

Now you must be wondering what is the difference between the yield and return?

  • When a return statement is invoked inside a function, it permanently passes control back to the caller of the function and disposes of a function's local state.

  • When a yield is invoked, it also passes the control back to the caller of the function but it only does so temporarily. It suspends the function and retains its local state.

def greeter(name):
    while True:
        yield f'Hello {name}'

gen_object = greeter('Pradhvan') 
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan

If we look at the above code we can clearly see that the local variables are stashed away temporarily: yield suspends the function and gives control back to the caller while retaining its local state.

Since it's doing lazy evaluation, it can be continued at any time with next() on the generator, which lets it produce an effectively infinite series of greeting messages.
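A common way to take a finite slice of such an endless generator is itertools.islice. The sketch below adds a count variable to greeter(), an assumption made here only to make the retained local state visible.

```python
from itertools import islice


def greeter(name):
    count = 0
    while True:
        count += 1  # local state survives between next() calls
        yield f"Hello {name} #{count}"


gen_object = greeter("Pradhvan")
first_three = list(islice(gen_object, 3))  # pulls exactly three items
fourth = next(gen_object)                  # picks up where islice stopped

print(first_three)
print(fourth)  # Hello Pradhvan #4
```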

Let's look at one more example of a code snippet where multiple yield statements decide the flow of the function.

def repeater():
    while True:
        print("Start")
        yield 1
        yield 2
        print("end")
gen_obj = repeater()
next(gen_obj) # 1
next(gen_obj) # 2
next(gen_obj) # 3

"""
OUTPUT # 1
Start
1
OUTPUT # 2
2
OUTPUT # 3
end
Start
1
"""

The above example makes it clear that in a generator function the point where the function suspends is decided by the yield statements. Call #2 leaves the function suspended at yield 2, so call #3 runs everything from there up to the next yield: it prints end, loops around, prints Start and yields 1.

Generator Expression

A generator function can be replaced with a generator expression. These are similar to list comprehensions, but where a list comprehension eagerly builds a list, a generator expression returns a generator that lazily produces the items.

def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res1 = [x*3 for x in range_123()]

"""
Output res1:
Start
Middle
End
"""

for i in res1:
    print("-->",i)
"""output:
--> 3
--> 9
"""
  • The list comprehension eagerly iterates over all the items to be yielded, which is why Start, Middle and End are printed right away.
  • When the for loop iterates over the list res1, it just returns the items that were already computed.
def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res2 = (x*3 for x in range_123())

print(res2) # <generator object <genexpr> at 0x7f8be1d09150>

for i in res2:
    print("-->",i)
"""
Output
Start
--> 3
Middle
--> 9
End
"""
  • In the case of generator expression, when the for loop iterates over the generator object res2, the body of the generator function range_123() actually executes.
  • Each iteration calls the next() while the iteration advances till a StopIteration is raised.

Comprehensions are a great way to increase the readability of your code, and if you use a generator expression, you make the comprehension more memory efficient as well.
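A rough way to see the memory difference is sys.getsizeof. The sketch below compares the container sizes only; it does not count the integers a list refers to, and the exact numbers vary by Python version and platform.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # built eagerly, all at once
squares_gen = (n * n for n in range(100_000))    # nothing computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# The generator still produces every value when it is finally consumed.
total = sum(squares_gen)
print(total == sum(squares_list))
```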

But sometimes we tend to overuse comprehensions, which backfires. I found a great article, Overusing list comprehensions and generator expressions in Python, which you should definitely look into.

Iteration is a fundamental technique which is supported by every programming language in the form of loops. The most common one is the for loop, and in Python's case specifically, we have the for-each loop. For-each loops are powered by iterators. An iterator is an object that does the actual iterating and fetches data one item at a time, on demand.

Let's take a step back and look back at some of the common terms which would help us in understanding iterators even better.

iterables: anything that can be iterated over is called an iterable.

for item in some_iterable:
    print(item)

sequences: Sequences are iterables which can be indexed.

numbers = [1,2,3,4]
tuples = (1,2,3)
word = 'Hello world'

The iter function

iter() is a built-in function, and whenever the interpreter needs to iterate over an object, it automatically calls iter().

The iter() function returns an iterator.

When the iter function is called it does three things:

  1. Checks whether the object implements the __iter__ method. (To see this just do dir() on the object.)
  2. If the __iter__ method is not present but __getitem__ is implemented, Python creates an iterator that fetches the items in order, starting from index zero.
  3. If that fails, a TypeError is raised stating that the object is not iterable.
numbers = [1,2,3,4]
num = iter(numbers) # Builds an iterator 'num' 

Looking at the code snippet above we can make a better definition of an iterable.

*Any object from which the iter() built-in function can obtain an iterator is called an iterable.*
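The __getitem__ fallback and the TypeError from the three steps of iter() can both be observed with a few lines. Countdown is a hypothetical class written only for this sketch; it deliberately has no __iter__ method.

```python
class Countdown:
    """Hypothetical class with __getitem__ but no __iter__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)  # tells the fallback iterator to stop
        return 3 - index


has_iter = hasattr(Countdown, "__iter__")
items = list(Countdown())  # iter() falls back to __getitem__(0), (1), ...

try:
    iter(42)  # an int has neither __iter__ nor __getitem__
except TypeError as exc:
    error = str(exc)

print(has_iter)  # False
print(items)     # [3, 2, 1]
print(error)
```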

Before moving forward, let's look at a nifty little way iter() works with functions to make them behave as an iterator.

Let's build a die roller that rolls a die from 1-6 and stops when the die hits 1.

In this usage we need to make sure of two things:

  1. The iter function must receive a callable that will be invoked every time the next function is called, and the callable must not take any arguments.
  2. The second argument, called the sentinel, acts as a flag: when the callable returns the sentinel, the iterator raises StopIteration instead of returning that value.
from random import randint

def die_roll():
    return randint(1,6)

roller = iter(die_roll, 1)

print(type(roller)) # <class 'callable_iterator'>

for roll in roller:
    print(roll)

"""
Output:
5
6
3
2
"""

Iterable vs Iterator

Python obtains an iterator from an iterable. Let's look at the for-each loop again to see how everything fits in the picture.

numbers = [1,2,3,4]
for number in numbers:
    print(number)

Looking at the code above we can only see the iterable, i.e. numbers. But what about the iterator? What about iter()? Isn't the loop supposed to need both to work?

Here we can't see the iterator or iter() in action, but they are working behind the scenes. Let's rewrite the whole statement as a while loop so we can see how it all fits together.

numbers = [1,2,3,4]
num = iter(numbers) # builds an iterator
while True:
    try:
        print(next(num))
    except StopIteration:
        del num
        break

The flow of the above code is simple:

  1. Iterator num is created from the iterable.
  2. To obtain a value from the iterator, next is used.
  3. The iterator raises StopIteration when there are no further items left.
  4. We delete the iterator and break out of the loop.

You must be wondering: everything is fine, but why did we delete the iterator?

Iterators are one-directional, and once all the items have been iterated over they can't be reset to their original state.

StopIteration signals that the iterator is exhausted, so it's best to delete it.
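A short sketch of what exhaustion looks like in practice: looping over the same iterator a second time silently produces nothing, while asking the iterable for a fresh iterator starts over. The variable names are chosen only for this illustration.

```python
numbers = [1, 2, 3, 4]
num = iter(numbers)

first_pass = list(num)            # consumes the iterator completely
second_pass = list(num)           # nothing left, so this is empty
fresh_pass = list(iter(numbers))  # a new iterator starts from the beginning

print(first_pass)   # [1, 2, 3, 4]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3, 4]
```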

Writing your own iterator

Python iterator objects are required to support two methods __iter__ and the __next__ method.

The __iter__ method returns self. This allows iterators to be used where an iterable is expected, i.e. with the “for” and “in” keywords.

The __next__ method returns the next available item, raising StopIteration when there are no more items to loop through.

Let's bundle this knowledge and build our very own Range built-in function.

class _Range:
    def __init__(self, start, end, step = 1):
        self.start = start
        self.end = end - 1 
        self.step = step

    def __iter__(self):
        return self

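    def __next__(self):
        if self.start > self.end:
            raise StopIteration
        # Remember the value to hand out before advancing by step,
        # so that steps other than 1 also return the right item.
        current = self.start
        self.start += self.step
        return current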

numbers = _Range(1, 3)
print(next(numbers)) # Result -> 1
print(next(numbers)) # Result -> 2
print(next(numbers)) # Raise a StopIteration Exception

Now that we know how an iterator works let's look back at the definition of an iterator again:

*Any object that implements the __next__ no-argument method that returns the next item in a series or raises StopIteration when there are no more items is called an Iterator.*

Just a quick tip before moving forward: the optimal way of creating your own iterator is to write a generator function, not an iterator class like we did here.
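As a rough illustration of that tip, the same range-like behaviour fits in a few lines as a generator function. gen_range is a made-up name, and the sketch assumes a positive step.

```python
def gen_range(start, end, step=1):
    """Yield start, start + step, ... while the value stays below end."""
    current = start
    while current < end:
        yield current
        current += step


numbers = gen_range(1, 3)
first = next(numbers)      # 1
second = next(numbers)     # 2
remaining = list(numbers)  # [] because the generator is now exhausted

print(first, second, remaining)
print(list(gen_range(0, 10, 3)))  # [0, 3, 6, 9]
```

Python writes __iter__ and __next__ for us here, and StopIteration is raised automatically when the function body ends.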

Iterator Protocol

The iterator objects are required to support the following two methods, which together form the iterator protocol. The __iter__ and the __next__ method.

iterator.__iter__()
iterator.__next__()
  • Iterator Protocol powers all the iteration in Python.
  • Iterator Protocol also powers the tuple unpacking in Python.
# Tuple unpacking
coordinates = (1, 2, 3)
x,y,z = coordinates
  • Iterator Protocol also powers the star expressions.
numbers = [1,2,3,4,5]
a,b,*rest = numbers 
print(a, b, rest)
  • Most of the built-in functions that require some kind of looping (iteration) in Python use the Iterator Protocol.

Python's tongue twister

Iterables are not necessarily iterators, but an iterator is necessarily iterable.

Example: generators are iterators, so they can be looped over; lists are iterables but not iterators.
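One way to check this yourself is with the abstract base classes in collections.abc. This is a small sketch; the checks dictionary is only there to label the results.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
gen = (n for n in numbers)

checks = {
    "list is iterable": isinstance(numbers, Iterable),
    "list is iterator": isinstance(numbers, Iterator),
    "generator is iterable": isinstance(gen, Iterable),
    "generator is iterator": isinstance(gen, Iterator),
}

# An iterator's __iter__ returns the iterator itself.
same_object = iter(gen) is gen

for label, value in checks.items():
    print(f"{label}: {value}")
print(same_object)  # True
```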

Reasons to use Iterator:

  • Iterators make lazy evaluation possible, which saves memory.
  • Iterators allow for infinitely long iterables.

Not so common iterators

  • Enumerate objects are also iterators.
  • Zip objects are also iterators.
  • Reversed objects are iterators.
  • Files are also iterators.
letters = ['a','b','c','d']
next(enumerate(letters)) # Result -> (0, 'a')
next(zip(letters,letters)) #  Result -> ('a','a')
next(reversed(letters)) #  Result -> 'd'
next(open('iterator.txt')) #  Result -> 'iterator\n'

Context Managers in Python help users manage resources efficiently in a program, i.e. opening and closing a resource or locking and unlocking a resource. Context Managers come in handy when we have to manage a resource before and after an event.

The most common context manager, which every Python programmer uses very frequently for managing files, is open() in a with as statement.

with open('MyContextManager.txt', 'w') as f:
    f.write('Using Context Manager is FUN !')

The above code snippet has mainly two advantages:

  • Helps to close resources automatically and effectively. This is a small code block, so it is easy to see that the file was opened and closed properly, but what happens when the scope of the function grows? This is where context managers really come into the picture.
  • Makes the code readable and complex logic simple. The above code can also be written as:
file = open('MyContextManager.txt','w')
try:
    file.write('Not using a Context Manager.')
finally:
    file.close()    

Here we manage the opening and closing of a file manually with a try-finally statement.

Python's standard library comes with a module, contextlib. This contains utilities for working with context managers and the with statement.
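Two of those ready-made utilities give a feel for the module. The sketch below uses contextlib.redirect_stdout and contextlib.suppress; the houses dictionary and the other variable names are made up for the illustration.

```python
import contextlib
import io

# redirect_stdout: everything printed inside the block goes to the buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print("captured")
captured = buffer.getvalue()

# suppress: the named exception is swallowed and the block simply ends.
houses = {"House Stark": "Wolf"}
with contextlib.suppress(KeyError):
    del houses["Lannister"]  # raises KeyError, which suppress() ignores
still_running = True

print(repr(captured))  # 'captured\n'
print(still_running)   # True
```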

Writing Your Own Context Manager

So why would someone want to write their own Context Managers?

Because, Context Managers are best at managing resources before and after an event; thus one doesn't have to worry about the allocation/de-allocation or Locking/Unlocking or Opening/Closing of events.

Also, they make the code simple and more readable.

Writing your own context manager can be done in two ways: either create your own class, or use the contextmanager decorator from the contextlib module.

Let's first look at how we can create a simple Context Manager class. A Context Manager class consists of two main methods, __enter__ and __exit__. If you're familiar with testing, you can compare these two methods with setup and teardown.

Just like in every class in Python, the __init__ method is optional. In the case of Context Managers, we use __init__ when the manager needs some data up front; here it is passed the object that we want to associate with the name after as in the with as statement.

Now let's take a look at a simple Game of Thrones inspired ContextManager which creates a dict of the house symbols.

class ThronesContextManager:
    def __init__(self, HouseSymbol):
        self.HouseSymbol = HouseSymbol

    def __enter__(self):
        print("Enter: {}".format(self.HouseSymbol))
        return self.HouseSymbol

    def __exit__(self, *exc):
        print("Exit: {}".format(self.HouseSymbol))


with ThronesContextManager({"House Stark": "Wolf"}) as HouseSymbol:
    HouseSymbol["Targaryen"] = "Three Headed Dragon"
    
"""
---Output---
Enter: {'House Stark': 'Wolf'}
Exit: {'House Stark': 'Wolf', 'Targaryen': 'Three Headed Dragon'}
"""
  • The __init__ method receives the dict passed to ThronesContextManager(...) and stores it on the instance, just like in any normal Python class.
  • The __enter__ method is called by with when the block starts. It takes no extra arguments, and the value it returns (here the stored dict) is what gets bound to the name after as (HouseSymbol).
  • The __exit__ method is called when the block ends. It receives three arguments describing any exception raised inside the block: the exception type, the exception value, and the traceback (collected here as *exc). All three are None if no exception occurred.
  • If for some reason you want the program to ignore the exception, return True from __exit__ and the exception is suppressed.
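
To make the __exit__ arguments and the return-True behaviour concrete, here is a minimal sketch. ExitInspector and its swallow flag are hypothetical names made up for this illustration; they are not part of the standard library.

```python
class ExitInspector:
    """Hypothetical helper that records what __exit__ receives."""

    def __init__(self, swallow=False):
        self.swallow = swallow
        self.seen = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # All three arguments are None when the block finishes normally.
        self.seen = (exc_type, exc_value)
        # A true return value tells Python to suppress the exception.
        return self.swallow


with ExitInspector() as calm:
    pass

with ExitInspector(swallow=True) as quiet:
    raise KeyError("House Bolton")
# Execution continues here because __exit__ returned True.

propagated = False
try:
    with ExitInspector() as loud:
        raise KeyError("House Bolton")
except KeyError:
    # __exit__ returned False, so the exception escaped the with block.
    propagated = True
```

After this runs, calm.seen holds (None, None), quiet.seen and loud.seen both start with KeyError, and only the second KeyError reaches the surrounding try.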

Looking at the above code example, we can say that any context manager has two methods: an __enter__ method and an __exit__ method.

Before moving on to the contextmanager decorator, let's break down the code snippet we saw at the start of the post and see how it works under the hood.

Since we know how context managers work, it won't be difficult to see what happens when we use a with ... as statement to open a file.

with open('MyContextManager.txt', 'w') as f:
    f.write('Using Context Manager is FUN !')
  1. open() opens the file and returns a file object; with then calls that object's __enter__ method.
  2. The __enter__ method returns the opened file object.
  3. The opened file object is bound to f.
  4. We write to the file using .write().
  5. When the block ends, the with statement calls the __exit__ method.
  6. The __exit__ method closes the file whether or not an exception occurred; any exception then propagates as usual.
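
The steps above can be sketched by calling the two methods by hand. This is a simplified approximation of what with does, not the exact expansion the interpreter performs, and the temporary path is an assumption used so the sketch does not touch real files.

```python
import os
import tempfile

# Hypothetical scratch location for the demo file.
path = os.path.join(tempfile.mkdtemp(), "MyContextManager.txt")

manager = open(path, "w")      # open() does the actual opening
f = manager.__enter__()        # the value that "as f" would bind
try:
    f.write("Using Context Manager is FUN !")
except BaseException as exc:
    # Hand the exception details to __exit__; re-raise unless it returns True.
    if not manager.__exit__(type(exc), exc, exc.__traceback__):
        raise
else:
    # No exception: __exit__ receives three Nones and closes the file.
    manager.__exit__(None, None, None)
```

For file objects, __enter__ returns the file itself, so f and manager are the same object, and the file is closed once __exit__ has run.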

The easier way to write a context manager is to use the contextmanager decorator from the contextlib module.

The good thing about @contextmanager is that it builds the __enter__ and __exit__ methods for you automatically: it turns a generator function into a context manager. The code before yield runs on entry, the yielded value is bound by as, and the code after yield runs on exit.

Let's rewrite ThronesContextManager, this time with @contextmanager.

from contextlib import contextmanager

@contextmanager
def ThronesContextManager(data):
    print("Enter: {}".format(data))
    yield data 
    print("Exit: {}".format(data))

with ThronesContextManager({"House Stark": "Wolf"}) as HouseSymbol:
    HouseSymbol["House Targaryen"] = "Three Headed Dragon"
    
"""
---Output---
Enter: {'House Stark': 'Wolf'}
Exit: {'House Stark': 'Wolf', 'House Targaryen': 'Three Headed Dragon'}
"""
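
One caveat with the generator version: the code after yield is skipped if the with block raises, because the exception is re-raised at the yield. A common way around this is to wrap the yield in try/finally. The sketch below assumes a hypothetical thrones_safe function and an events list that exists only to make the order visible.

```python
from contextlib import contextmanager

events = []  # hypothetical log, used only to show the order of calls


@contextmanager
def thrones_safe(data):
    events.append("enter")
    try:
        yield data
    finally:
        # Runs even when the with block raises.
        events.append("exit")


try:
    with thrones_safe({"House Stark": "Wolf"}) as house_symbol:
        raise RuntimeError("Winter is coming")
except RuntimeError:
    events.append("error propagated")
```

Here the cleanup step is recorded before the RuntimeError reaches the outer except clause.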

PyRandom

Here are some interesting things I found about context managers. I came across these while researching for this blog post, which is why I am adding them to the PyRandom section. I will keep updating this section as I learn more about context managers.

  • Context managers do not create a new scope in the program, i.e. variables defined inside the with ... as block are still available after the block has executed.
with open('MyContextManager.txt') as f:
    # Variable defined inside the Context Manager
    VariableName = f.read()
print(VariableName)
  • When using multiple context managers in one with ... as statement, entering and exiting follow LIFO (Last In, First Out) order, i.e. the context manager whose __enter__ method is called last has its __exit__ method called first.
import contextlib

@contextlib.contextmanager
def make_context(name):
    print ('entering:', name)
    yield name
    print ('exiting :', name)

with make_context('A') as A, make_context('B') as B, make_context('C') as C:
    print ('inside with statement:', A, B, C)
    
"""
---OUTPUT---
entering: A
entering: B
entering: C
inside with statement: A B C
exiting : C
exiting : B
exiting : A
"""
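
The LIFO order follows from the fact that the comma-separated form behaves like nested with statements. A small sketch, using a hypothetical order list instead of print so the sequence can be inspected:

```python
import contextlib

order = []  # hypothetical log of enter/exit events


@contextlib.contextmanager
def make_context(name):
    order.append("enter " + name)
    yield name
    order.append("exit " + name)


# Same behaviour as: with make_context('A') as A, make_context('B') as B:
with make_context('A') as A:
    with make_context('B') as B:
        order.append("body")
```

The inner block has to finish (exiting B) before the outer block can finish (exiting A).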

What now ?

Now that we have covered the basics of context managers, we can start digging deeper and learn to use them in more realistic scenarios. Here are a few things that I would like you to read more about:

  • Learn how to handle exceptions in/with Context Managers.
  • Try to find out real project use cases where using a Context Manager would be best suited.
  • Find out the role of __init__ and __enter__ in a context manager.

Still can't get enough ?

The reason behind this blog post is that I recently picked up a Python problem which goes something like this:

Write a context manager named suppress which suppresses exceptions of the given type or types, i.e. if one of the given exception types is raised inside the block, that exception should be caught and silenced.

Code Example:

>>> x = 0
>>> with suppress(ValueError):
...     x = int('hello')
...
>>> x
0
>>> with suppress(ValueError, TypeError):
...     x = int(None)
...
>>> x
0

Since you have read this far, I am assuming you are also just starting to learn about this topic. Let's put everything we have learned so far to the test and get some hands-on experience writing a context manager. Try to solve this problem.

I am still solving the problem, and once it's done I will link my solution here.

Happy Coding !