Talking Concurrency: Introduction

Whenever we think of programs or algorithms we think of steps that are supposed to be done one after the other to achieve a particular goal. Let's take a very simple example of a function that is supposed to greet a person:

def greeter(name):
    """Greeting function"""
    print(f"Hello {name}")

greeter(Guido) #1
greeter(Luciano) #2
greeter(Kushal) #3
"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Here the function greeter() greets the person who's name is passed through it. But it does it sequentially i.e when greeter(Guido) will run the whole program will block it's state unless the function executes successfully or not. If it runs successfully then only the second and third function calls will be made.

This familiar style of programming is called sequential programming.

Why concurrency?

Sequential programming is comparatively easy to understand and most of the time fit the use case. But sometimes you need to get most out of your system for any X reason, the most common substituent of X, I could find is scaling your application.

Though greeter() is just a toy example but a real-world application with real user need to work the same even on huge amount of traffic it receives. Every time you get that spike in your traffic/daily active user you can't just add more hardware so one of the best solutions at times is to utilize your current system to the fullest. Thus Concurrency comes into the picture.

Concurrency is about dealing with lots of things at once. – Rob Pike

Challenges in writing concurrent programs

Before I move forward, I know what most of the people will say. If it's that important why at work/college/park/metro station/.. people are not talking about it? Why most of the people still use sequential programming patterns while coding?

Because of a very simple reason, it's not easy to wrap your head around and it's very easy to write sequential code pretending to be concurrent code.

concurrency-comic

I got to know about this programming style very late and later when I talked to people they said the same thing. It's not easy to code, you can easily skip the best practices and very hard to debug so most of the people try to stick to the normal style of programming.

How Python handles concurrency?

The two most popular ways(techniques) of dealing with concurrency in Python is through:

  1. Threading
  2. Asyncio

Threading: Python has a threading module that helps in writing multi-threaded code. You can spawn independent threads share common states (just like a common variable that is accessed by two independent threads).

Let's re-write that greeter() function again now with threads.

import threading 
import time
def main():
    thread1 = threading.Thread(target=greeter, args=('Guido',))
    thread2 = threading.Thread(target=greeter, args=('Luciano',))
    thread3 = threading.Thread(target=greeter, args=('Kushal',))
    thread1.start()
    thread2.start()
    thread3.start()

def greeter(name):
    print("Hello {}".format(name))
    time.sleep(1)
    
if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""
    

Here thread1, thread2, thread3 are three independent threads that run alongside main thread of the interpreter. This may look it is running in parallel but it's not. Whenever the thread waits(here it's a simple function so you might see that), this wait can be anything reading from a socket, writing to a socket, reading from a Database. Its control is passed on to the other thread in the queue. In threading, this switching is done by the operating system(preemptive multitasking).

Though threads seem to be a good way to write multithreaded code it does have some problems too.

Asyncio: Python introduced asyncio package in 3.4, which followed a different approach of doing concurrency. It brought up the concept of coroutines. A coroutine is a restartable function that can be awaited(paused) and restarted at any given point. Unlike threads, the user decides which coroutine should be executed next. Thus this became cooperative multitasking.

Asyncio brought new keywords like async and await. A coroutine is defined with the async keyword and is awaited so that the waiting time can be utilized by the other coroutine.

Let's rewrite the greeter() again but now using the Asyncio.

import asyncio


async def greeter(name):
	await asyncio.sleep(1)
	print(f'Hello {name}')


def main():
    loop = asyncio.get_event_loop()

    task1 = loop.create_task(greeter('Guido'))
    task2 = loop.create_task(greeter('Luciano'))
    task3 = loop.create_task(greeter('Kushal'))

    final_task = asyncio.gather(task1, task2, task3)
    loop.run_until_complete(final_task)


if __name__ == '__main__':
    main()

"""
Output:
Hello Guido
Hello Luciano
Hello Kushal
"""

Looking at the above code we see some of the not so common jargons thrown around, event loop, tasks, and a new sleep function. Let's understand them before we dissect the code and understand it's working.

Now let's move on to the code. Here task1,task2 and task3 work concurrently calling the coroutine. Once all the tasked are gathered the event loop runs until all the tasks are completed.

I hope this gives you a brief overview of Concurrency, we would be diving deep into both threading and asyncio and how can we use async for web applications using aiohttp and quart.

Stay tuned this will be a multi-part series.

While reading about concurrency you might a lot of other topics that you might confuse concurrency with so let's look at them now just so we know how is concurrency different.

Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. Not the same, but related. One is about structure, one is about execution. Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable. -Rob Pike

Part 2: Talking Concurrency: Asyncio