pradhvan

Python geek who loves to play around with web technologies.

Whenever we think of programs or algorithms we think of steps that are supposed to be done one after the other to achieve a particular goal. Let's take a very simple example of a function that is supposed to greet a person:

def greeter(name):
    """Greeting function"""
    print(f"Hello {name}")

greeter("Guido") #1
greeter("Luciano") #2
greeter("Kushal") #3
"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Here the function greeter() greets the person whose name is passed to it, but it does so sequentially i.e when greeter("Guido") runs, the whole program blocks until that call finishes. Only after it completes will the second and third function calls be made.

This familiar style of programming is called sequential programming.

Why concurrency?

Sequential programming is comparatively easy to understand and most of the time fits the use case. But sometimes you need to get the most out of your system for some reason X, and the most common value of X I could find is scaling your application.

Though greeter() is just a toy example, a real-world application with real users needs to keep working even with the huge amount of traffic it receives. Every time you get a spike in your traffic/daily active users you can't just add more hardware, so one of the best options at times is to utilize your current system to the fullest. This is where concurrency comes into the picture.

Concurrency is about dealing with lots of things at once. – Rob Pike

Challenges in writing concurrent programs

Before I move forward, I know what most people will say: if it's that important, why aren't people at work/college/park/metro station/.. talking about it? Why do most people still use sequential programming patterns while coding?

For a very simple reason: it's not easy to wrap your head around, and it's very easy to write sequential code pretending to be concurrent code.

[comic: concurrency]

I got to know about this style of programming quite late, and when I later talked to people they said the same thing: it's not easy to write, it's easy to skip the best practices, and it's very hard to debug, so most people try to stick to the usual style of programming.

How does Python handle concurrency?

The two most popular ways (techniques) of dealing with concurrency in Python are:

  1. Threading
  2. Asyncio

Threading: Python has a threading module that helps in writing multi-threaded code. You can spawn independent threads that share common state (just like a common variable that is accessed by two independent threads).

Let's re-write that greeter() function again now with threads.

import threading
import time

def main():
    thread1 = threading.Thread(target=greeter, args=('Guido',))
    thread2 = threading.Thread(target=greeter, args=('Luciano',))
    thread3 = threading.Thread(target=greeter, args=('Kushal',))
    thread1.start()
    thread2.start()
    thread3.start()

def greeter(name):
    print("Hello {}".format(name))
    time.sleep(1)  # simulate a wait, e.g. some I/O

if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""
    

Here thread1, thread2, and thread3 are three independent threads that run alongside the main thread of the interpreter. This may look like it is running in parallel, but it's not. Whenever a thread waits (here it's a simple function, so you may not notice it), control is passed on to the next thread in the queue; this wait can be anything: reading from a socket, writing to a socket, reading from a database. In threading, this switching is done by the operating system (preemptive multitasking).

Though threads seem to be a good way to write multithreaded code, they do have some problems too.

  • The switching between threads during the waiting period is done by the operating system. The user has no control over it (the sketch after this list shows why that makes shared state tricky).
  • Python has a lock called the GIL (Global Interpreter Lock), and only the thread that holds the GIL can run; the others have to wait for their turn to get the GIL before they can proceed. This is fine if you're doing an I/O bound task but hurts if you're doing a CPU bound task.
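Because the operating system decides when to switch, two threads can be interleaved in the middle of an update to shared state. Here is a minimal sketch, with names of my own choosing, of protecting a shared counter with threading.Lock:

import threading

counter = 0                      # state shared by all the threads
counter_lock = threading.Lock()  # guards the shared counter

def increment(times):
    global counter
    for _ in range(times):
        # Without the lock, this read-modify-write could be interrupted
        # by the OS mid-way and some updates would be lost.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 300000 every time, thanks to the lock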

Asyncio: Python introduced the asyncio package in 3.4, which takes a different approach to concurrency. It brought in the concept of coroutines. A coroutine is a restartable function that can be awaited (paused) and resumed at any given point. Unlike threads, the user decides which coroutine should be executed next, so this is cooperative multitasking.

Asyncio brought new keywords like async and await. A coroutine is defined with the async keyword and is awaited so that the waiting time can be utilized by other coroutines.

Let's rewrite the greeter() again but now using the Asyncio.

import asyncio


async def greeter(name):
    await asyncio.sleep(1)
    print(f'Hello {name}')


def main():
    loop = asyncio.get_event_loop()

    task1 = loop.create_task(greeter('Guido'))
    task2 = loop.create_task(greeter('Luciano'))
    task3 = loop.create_task(greeter('Kushal'))

    final_task = asyncio.gather(task1, task2, task3)
    loop.run_until_complete(final_task)


if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Looking at the above code we see some not-so-common jargon thrown around: event loop, tasks, and a new sleep function. Let's understand them before we dissect the code and understand its working.

  • Event loop: one of the most important parts of async code, this is simply a loop that keeps running and checks if anything has finished its wait and needs to be executed. Only a single task can run in an event loop at a time.
  • Coroutines: here greeter() is a coroutine which prints the greeting. Though this is a simple example, in an I/O bound process a coroutine needs to wait, and await lets the program hand over control while it waits. asyncio.sleep() is different from time.sleep() because asyncio.sleep() is a non-blocking call, i.e. it does not hold up the program until the execution is completed. The argument given to asyncio.sleep() is the most it will wait.
  • Tasks: calling a coroutine does not return its value, it returns a coroutine object. Separate tasks are created from these coroutine objects so that they can run independently on the event loop.

Now let's move on to the code. Here task1, task2 and task3 call the coroutine concurrently. Once all the tasks are gathered, the event loop runs until all of them are completed.
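As a side note, on Python 3.7+ the same program can be written a bit more compactly with asyncio.run(), which creates and closes the event loop for you; this is just an equivalent sketch, not a different technique:

import asyncio


async def greeter(name):
    await asyncio.sleep(1)
    print(f'Hello {name}')


async def main():
    # gather() schedules the three coroutines concurrently and waits for all of them
    await asyncio.gather(greeter('Guido'), greeter('Luciano'), greeter('Kushal'))


if __name__ == '__main__':
    asyncio.run(main())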

I hope this gives you a brief overview of concurrency. We will be diving deeper into both threading and asyncio and into how we can use async for web applications using aiohttp and quart.

Stay tuned this will be a multi-part series.

While reading about concurrency you might come across a lot of other topics that are easy to confuse with it, so let's look at them now just so we know how concurrency is different.

Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. Not the same, but related. One is about structure, one is about execution. Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable. -Rob Pike

  • Parallelism: doing tasks simultaneously. This is different from concurrency in that with parallelism all the tasks run side by side without waiting (sleep) for other tasks, unlike concurrent tasks. The method to achieve it in Python is multiprocessing (a minimal sketch follows this list). Multiprocessing is well suited for CPU bound tasks as it distributes tasks over different cores of the CPU. Sadly Python's GIL doesn't go well with CPU bound tasks.

  • Single-Threaded/Multi-Threaded: Python is effectively single-threaded because of the GIL, but you can still use multiple threads. These threads run along with the main thread. So threading, in general, is a method to achieve concurrency.

  • Asynchronous: asynchrony is used to describe the idea of either concurrent or parallel tasks, and when we talk about asynchronous execution the tasks can correspond to different threads, processes or even servers.
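Since multiprocessing came up in the first point, here is a minimal sketch of spreading a CPU bound function over several cores with the multiprocessing module; the squaring function is just a stand-in for real work:

from multiprocessing import Pool

def square(n):
    # Stand-in for a CPU bound computation
    return n * n

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # Each worker process has its own interpreter and its own GIL,
        # so the work is actually spread across CPU cores.
        print(pool.map(square, range(10)))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]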

In the last blog I talked about Iterators and Iterables and I am assuming you're familiar with both of the concepts. So moving forward, let's talk about generators.

Simply put, generators are iterators built with the yield keyword; they do not return, they yield. Similarly, a generator function is one that has a yield keyword in its body.

Let's look at some code and find out a bit more about them so we can define them more formally.

def range_123():
    print("Start")
    yield 1
    yield 2
    yield 3
    print('End')


for number in range_123():
    print(number)
"""
OUTPUT:
Start
1
2
3
End
"""

numbers = range_123() # Assigning the generator object to numbers

next(numbers) # Prints "Start", then returns 1
next(numbers) # Returns 2
next(numbers) # Returns 3
next(numbers) # Prints "End", then raises StopIteration

When we look closely at the above code, range_123() is a generator function. Since generators are iterators, we can either iterate over the generator directly or assign it to a generator object and then call next() on it until it's exhausted and raises the StopIteration error, conforming to the Iterator Protocol.

Now you must be wondering what is the difference between the yield and return?

  • When a return statement is invoked inside a function, it permanently passes control back to the caller of the function and disposes of a function's local state.

  • When a yield is invoked, it also passes the control back to the caller of the function but it only does so temporarily. It suspends the function and retains its local state.

def greeter(name):
    while True:
        yield f'Hello {name}'

gen_object = greeter('Pradhvan') 
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan
next(gen_object) # Output -> Hello Pradhvan

If we look at the above code we can clearly see that the local variables are stashed away temporarily, suspending the function and giving control back to the caller while retaining its local state.

Since the evaluation is lazy, the generator can be resumed at any time with next(), which means it can produce an effectively infinite series of greeting messages.

Let's look at one more example of a code snippet where multiple yield statements decide the flow of the function.

def repeater():
    while True:
        print("Start")
        yield 1
        yield 2
        print("end")

gen_obj = repeater()
next(gen_obj) # first call -> 1
next(gen_obj) # second call -> 2
next(gen_obj) # third call -> 1 again

"""
OUTPUT of call #1:
Start
1
OUTPUT of call #2:
2
OUTPUT of call #3:
end
Start
1
"""

The above example makes it clear that in a generator function the yield statements decide where the function suspends. Call #2 suspends right after yielding 2, and when we call next() a third time we get the whole remaining block: the "end" print, the loop restarting with "Start", and the yield of 1 again.

Generator Expression

A generator function can often be replaced with a generator expression. These are similar to list comprehensions, but where a list comprehension eagerly builds a list, a generator expression returns a generator that lazily produces the items.

def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res1 = [x*3 for x in range_123()]

"""
Output res1:
Start
Middle
End
"""

for i in res1:
    print("-->",i)
"""output:
--> 3
--> 9
"""
  • The list comprehension eagerly iterates over the items that are to be yielded, so it prints Start, Middle and End up front.
  • When the for loop iterates over the list stored in res1, it just returns the items that were already produced.
def range_123():
    print("Start")
    yield 1
    print("Middle")
    yield 3
    print("End")

res2 = (x*3 for x in range_123())

print(res2) # <generator object <genexpr> at 0x7f8be1d09150>

for i in res2:
    print("-->",i)
"""
Output
Start
-->i
Middle
-->i
End
"""
  • In the case of the generator expression, the body of the generator function range_123() only executes when the for loop iterates over the generator object res2.
  • Each iteration calls next() and the iteration advances until a StopIteration is raised.

Comprehensions are a great way to increase the readability of your code, and if you use a generator expression you make the comprehension more memory efficient as well.
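To make the memory claim concrete, here is a small sketch comparing the size of a list comprehension with the equivalent generator expression; the exact numbers will vary by Python version and platform:

import sys

squares_list = [x * x for x in range(1_000_000)]  # all items built up front
squares_gen = (x * x for x in range(1_000_000))   # items produced on demand

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a couple of hundred bytes, no matter the range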

But sometimes we tend to overuse the whole comprehension feature which backfires, I found a great article Overusing list comprehensions and generator expressions in Python which you should definitely look into.

Iteration is a fundamental technique supported by every programming language in the form of loops. The most common one, at least in my experience, is the for loop, and in Python's case specifically we have a for-each loop. For-each loops are powered by iterators. An iterator is an object that does the actual iterating and fetches data one item at a time, on demand.

Let's take a step back and look back at some of the common terms which would help us in understanding iterators even better.

iterables: anything that can be iterated over is called an iterable.

for item in some_iterable:
    print(item)

sequences: Sequences are iterables which can be indexed.

numbers = [1,2,3,4]
tuples = (1,2,3)
word = 'Hello world'

The iter function

iter() is a built-in function, and whenever the interpreter needs to iterate over an object it automatically calls iter() on it.

The iter() function returns an iterator.

When the iter function is called it does three things:

  1. It checks whether the object implements the __iter__ method. (To see this just do dir() on the object.)
  2. If the __iter__ method is not present but __getitem__ is implemented, Python creates an iterator that fetches the items in order, starting from index zero.
  3. If that fails, a TypeError is raised stating "Object is not iterable".
numbers = [1,2,3,4]
num = iter(numbers) # Builds an iterator 'num' 

Looking at the code snippet above we can make a better definition of an iterable.

*Any object on which the iter() built-in function can be called is an iterable.*
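To see the __getitem__ fallback from step 2 in action, here is a minimal sketch of a made-up class that defines no __iter__ at all and is still iterable:

class Team:
    def __init__(self, members):
        self.members = members

    def __getitem__(self, index):
        # No __iter__ here; iter() falls back to indexing from zero
        # and stops when IndexError is raised.
        return self.members[index]

team = Team(['Guido', 'Luciano', 'Kushal'])
for member in team:  # works because of the __getitem__ fallback
    print(member)
print(iter(team))    # <iterator object at 0x...>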

Before moving forward, let's look at a nifty little way in which iter() works with functions to make them behave as iterators.

Let's build a die roller that rolls a die from 1-6 and stops when the die hits 1.

In this usage we need to make sure of two things:

  1. The iter function must receive a callable that will be invoked every time next() is called, and the callable should take no arguments.
  2. The second argument, called the sentinel, acts as a flag: when the callable returns this value, the iterator raises StopIteration instead of returning it.
from random import randint

def die_roll():
    return randint(1,6)

roller = iter(die_roll, 1)

print(type(roller)) # <class 'callable_iterator'>

for roll in roller:
    print(roll)

"""
Output:
5
6
3
2
"""

Iterable vs Iterator

Python obtains an iterator from an iterable. Let's look at the for-each loop again to see how everything fits in the picture.

numbers = [1,2,3,4]
for number in numbers:
    print(number)

Looking at the code above we can only see the iterable, i.e. numbers. But what about the iterator? What about iter()? Isn't the loop supposed to use both to work?

Here we can't see the iterator or iter() in action, but they are working behind the scenes. Let's re-write the whole statement as a while loop so we can see how it all fits together.

numbers = [1,2,3,4]
num = iter(numbers) # builds an iterator
while True:
    try:
        print(next(num))
    except StopIteration:
        del num
        break

The flow of the above code is simple:

  1. Iterator num is created from the iterable.
  2. To obtain the value from the iterator next is used.
  3. The iterator raises the StopIteration error when there are no further items left.
  4. We delete the iterator and break out of the loop.

You must be wondering: everything is fine, but why did we delete the iterator?

Iterators have the property that they are one-directional; once all the items have been iterated over, they can't be reset to their original state.

The StopIteration signals that the iterator is exhausted, so it's best to delete it.
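A quick sketch of that one-directional behaviour: an exhausted iterator keeps raising StopIteration, while calling iter() on the iterable again gives you a fresh iterator.

numbers = [1, 2]
num = iter(numbers)

next(num)  # 1
next(num)  # 2
# next(num) would now raise StopIteration, and keeps doing so;
# the iterator cannot be rewound.

num = iter(numbers)  # build a fresh iterator instead
next(num)  # 1 again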

Writing your own iterator

Python iterator objects are required to support two methods __iter__ and the __next__ method.

The __iter__ method returns self. This allows iterators to be used where an iterable is expected, i.e. with the "for" and "in" keywords.

The __next__ method returns the next available item, raising StopIteration when there are no more items to be looped through.

Let's bundle this knowledge and build our very own Range built-in function.

class _Range:
    def __init__(self, start, end, step = 1):
        self.start = start
        self.end = end - 1 
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):
        if self.start > self.end:
            raise StopIteration
        else:
            self.start += self.step
            return self.start - 1

numbers = _Range(1, 3)
print(next(numbers)) # Result -> 1
print(next(numbers)) # Result -> 2
print(next(numbers)) # Raise a StopIteration Exception

Now that we know how an iterator works let's look back at the definition of an iterator again:

*Any object that implements the __next__ no-argument method that returns the next item in a series or raises StopIteration when there are no more items is called an Iterator.*

Just a quick tip before moving forward: the simpler way of creating your own iterator is usually to write a generator function, not to create an iterator class like we did here.
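For comparison, here is the same range-like behaviour written as a generator function; this is only a sketch mirroring the class above, but it is the shorter and more common way to get an iterator:

def _range(start, end, step=1):
    # The generator object returned by calling this function is itself
    # an iterator: it implements both __iter__ and __next__.
    current = start
    while current < end:
        yield current
        current += step

numbers = _range(1, 3)
print(next(numbers))  # 1
print(next(numbers))  # 2
print(next(numbers))  # Raises a StopIteration exception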

Iterator Protocol

Iterator objects are required to support the following two methods, which together form the iterator protocol:

iterator.__iter__()
iterator.__next__()
  • The Iterator Protocol powers all the iteration in Python.
  • Iterator Protocol also powers the tuple unpacking in Python.
# Tuple unpacking (the coordinates value is illustrative)
coordinates = (10, 20, 30)
x,y,z = coordinates
  • Iterator Protocol also powers the star expressions.
numbers = [1,2,3,4,5]
a,b,*rest = numbers
print(rest) # [3, 4, 5]
  • Most of the built-in functions that require some kind of looping (iteration) in Python use the Iterator Protocol.

Python's tongue twister

Iterables are not necessarily iterators, but an iterator is necessarily an iterable.

Example: Generators are iterators that can be looped over, while lists are iterables but not iterators.

Reasons to use Iterator:

  • Iterators make lazy evaluation possible, which saves memory.
  • Iterators allow for infinitely long iterables (a small sketch follows).
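itertools.count is a handy built-in example of an infinitely long iterator; here is a small sketch that lazily takes just the first few items from it:

from itertools import count, islice

evens = count(start=0, step=2)       # an infinite iterator: 0, 2, 4, ...
first_five = list(islice(evens, 5))  # islice lazily takes the first five items
print(first_five)                    # [0, 2, 4, 6, 8]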

Not so common iterators

  • Enumerate objects are also iterators.
  • Zip objects are also iterators.
  • Reversed objects are iterators.
  • Files are also iterators.
letters = ['a','b','c','d']
next(enumerate(letters)) # Result -> (0, 'a')
next(zip(letters,letters)) #  Result -> ('a','a')
next(reversed(letters)) #  Result -> 'd'
next(open('iterator.txt')) #  Result -> 'iterator\n'

Context Managers in Python help users manage resources efficiently in a program, i.e. opening/closing or locking/unlocking a resource. Context Managers come in handy when we have to manage a resource before and after an event.

The most common context manager, the one every Python programmer uses very frequently for managing files, is the with as statement.

with open('MyContextManager.txt', 'w') as f:
    f.write('Using Context Manager is FUN !')

The above code snippet has mainly two advantages:

  • Helps to close resources automatically and reliably. This is a small code block, so it is easy to see that the file was opened and closed properly, but what would happen when the scope of the function increases? This is where context managers really come into the picture.
  • Makes the code readable and complex logic simple. The above code can also be written as:
file = open('MyContextManager.txt','w')
try:
    file.write('Not using a Context Manager.')
finally:
    file.close()    

Here we manage the opening and closing of a file manually with a try-finally statement.

Python's standard library comes with a module, contextlib. This contains utilities for working with context managers and the with statement.

Writing Your Own Context Manager

So why would someone want to write their own Context Managers?

Because Context Managers are best at managing resources before and after an event, one doesn't have to worry about allocation/de-allocation, locking/unlocking or opening/closing of resources.

Also, they make the code simple and more readable.

Writing your own context manager can be done in two ways: either create your own class or use the contextmanager decorator from the contextlib module.

Let's first look at how we can create a simple Context Manager class. A Context Manager class consists of two main methods: __enter__ and __exit__. If you're familiar with testing, you can compare these two methods with setup and teardown.

Just like in every Python class, the __init__ method is optional. But in the case of Context Managers we use __init__ when we need to pass something in: it receives the object we want to manage, which in our example is what ends up associated with the name after as in the with as statement.

Now let's take a look at a simple Game of Thrones inspired ContextManager which creates a dict of the house symbols.

class ThronesContextManager:
    def __init__(self, HouseSymbol):
        self.HouseSymbol = HouseSymbol

    def __enter__(self):
        print("Enter: {}".format(self.HouseSymbol)")
        return self.HouseSymbol

    def __exit__(self, *exc):
        print("Exit: {}".format(self.HouseSymbol))


with ThronesContextManager({"House Stark": "Wolf"}) as HouseSymbol:
    HouseSymbol["Targaryen"] = "Three Headed Dragon"
    
"""
---Output---
Enter: {'House Stark': 'Wolf'}
Exit: {'House Stark': 'Wolf', 'Targaryen': 'Three Headed Dragon'}
"""
  • The __init__ method takes in the dict that will end up associated with as, just like in any with as statement. It creates an instance of the class and stores the dict on it, much like any normal Python class.
  • The __enter__ method is called by the with statement. It returns the dict, which is what gets bound to the name HouseSymbol after as.
  • The __exit__ method takes in the exception information (*exc), which consists of three values: exc_type (the exception type), exc_value (the exception itself) and exc_tb (the exception traceback).
  • If for some reason you want the program to ignore the exception, __exit__ can return True to suppress it (see the sketch below).
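To illustrate that last point, here is a minimal sketch, with a made-up class name, of an __exit__ that swallows one specific exception by returning True:

class IgnoreZeroDivision:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        # Returning True tells the with statement the exception is handled
        return exc_type is ZeroDivisionError

with IgnoreZeroDivision():
    result = 1 / 0       # raises ZeroDivisionError, which __exit__ mutes
print("still running")   # execution continues past the with block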

Looking at the above code example, we can say that any Context Manager has two methods: an __enter__ method and an __exit__ method.

Before moving on to the contextmanager decorator, let's break down the code snippet we saw at the start of the post and see how it works under the hood.

Since we know how context managers work, it won't be difficult to observe what's happening when we use the with as statement while opening a file.

with open('MyContextManager.txt', 'w') as f:
    f.write('Using Context Manager is FUN !')
  1. with calls the __enter__ method of the file object.
  2. The __enter__ method opens the file and returns it.
  3. The opened file handle is passed to f.
  4. We write to the file using .write().
  5. The with statement calls the __exit__ method.
  6. __exit__ checks for exceptions; if no exception is found it closes the file. (The sketch after this list verifies that file objects really carry these two methods.)
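You can check this yourself with a quick sketch, re-using the file name from the earlier snippet:

f = open('MyContextManager.txt', 'w')
print(hasattr(f, '__enter__'))  # True, so `with` can call it
print(hasattr(f, '__exit__'))   # True, so the file gets closed for us
f.close()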

The easier way to write a context manager is by using the contextlib module and its contextmanager decorator.

The good thing about using @contextmanager is that it builds the __enter__ and __exit__ methods for you automatically, so we can turn a generator function into a context manager.

Let's re-write the ThronesContextManager again, but now with the @contextmanager decorator.

from contextlib import contextmanager

@contextmanager
def ThronesContextManager(data):
    print("Enter: {}".format(data))
    yield data 
    print("Exit: {}".format(data))

with ThronesContextManager({"House Stark": "Wolf"}) as HouseSymbol:
    HouseSymbol["House Targaryen"] = "Three Headed Dragon"
    
"""
---Output---
Enter: {'House Stark': 'Wolf'}
Exit: {'House Stark': 'Wolf', 'House Targaryen': 'Three Headed Dragon'}
"""

PyRandom

Here are some interesting things I found about Context Managers. I came across these while researching this blog post, which is why I am adding them to the PyRandom section. I will keep updating this section as I learn more about Context Managers.

  • Context Managers do not create a separate scope in the program, i.e. variables defined inside the with as block are still available after the block is executed.
with open('MyContextManager.txt') as f:
    # Variable defined inside the Context Manager
    VariableName = f.read()
print(VariableName)
  • When using multiple Context Managers in a single with as statement, the flow of the __enter__ and __exit__ calls becomes LIFO (Last In First Out), i.e. the __enter__ method that is called last will have its __exit__ method called first.
import contextlib

@contextlib.contextmanager
def make_context(name):
    print ('entering:', name)
    yield name
    print ('exiting :', name)

with make_context('A') as A, make_context('B') as B, make_context('C') as C:
    print ('inside with statement:', A, B, C)
    
"""
---OUTPUT---
entering: A
entering: B
entering: C
inside with statement: A B C
exiting : C
exiting : B
exiting : A
"""

What now?

Since we covered all the basics of Context Managers, we can start digging deeper and learn to use Context Managers in more realistic scenarios. So here are a few things that I would like you to read more about:

  • Learn how to handle exceptions in/with Context Managers.
  • Try to find out real project use cases where using a Context Manager would be best suited.
  • Find out the roles of __init__ and __enter__ in a Context Manager.

Still can't get enough?

The reason behind this blog is that I recently picked up a Python problem which goes something like this:

Write a Context Manager named Suppress which suppresses the exception of a given type/types i.e if the given exception type is raised, that exception should be caught and muted in a sense.

Code Example:

>>> x = 0
>>> with suppress(ValueError):
...     x = int('hello')
...
>>> x
0
>>> with suppress(ValueError, TypeError):
...     x = int(None)
...
>>> x
0

Since you read this far, I am assuming you are also just starting to learn about this topic. Let's put all that we have learned so far to the test and get some hands-on experience writing your own Context Manager. Try to solve this problem.

I am still solving the problem, and once it's done I will link my solution here.

Happy Coding !