Variables and Memory Management in Python

Variable

Variable is a way to refer to memory locations being accessed by a computer program. In languages like C, C++, Java, a variable is a name given to a memory location. So when we say:

int x;
float y;

This small piece of code asks the compiler to create memory space for the variables, 4 bytes each (the size depends on the compiler and programming language). The variables are basically names assigned to the memory location, if the variable is defined for a particular data type then only declared type data can be stored in it. But in python, we do not need to declare a variable beforehand like int x, we can straightaway move to define the variable and start using it like x = 10. alt text

Now the point is, how does Python know what is the type of the variable and how to access it? In Python, data types (integer, list, string, functions, etc. ) are objects and every object has a unique id that is assigned to it, at the time of object creation. Variables are references (pointers) to these objects. What happens when we say :

x = 10

Now if we create new variable y which is equal to x, what do you think happens internally?

y=x

y now refers to the same object as x. alt text

How do we know that? It can be verified by checking the id of the object that these two variables x and y are pointing to.

id(y) == id(x)
# output
True

We can also print the id of the object

print(f"id of the object pointed by x:{id(x)}")
print(f"id of the object pointed by y:{id(y)}") 

# output
#your id value might be different from mine but the two ids printed must be the same
id of the object pointed by x:140733799282752 
id of the object pointed by y:140733799282752

Now, what if we assign a different value to y?

y=40

This will create another integer object with the value 40 and y will now refer to 40 instead of 10. alt text How to check Again, we will use id() to see the ids of the objects.

print(x) # outputs 10
print(y) # outputs 40
print(f"id of the object pointed by x:{id(x)}")
print(f"id of the object pointed by y:{id(y)}")  

# output
id of the object pointed by x:140733799282752
id of the object pointed by y:140733799283712

As we can see from above, the two ids are now different which means a new object with value 40 is created and y refers to that object now.

If we wish to assign some other value to variable x

x = "python"
print(id(x))

# output
1783541430128

We will get an object id that is different from the id “140733799282752” printed by id(x) for integer object 10 previously.

So what happened to the integer object 10? The integer object 10 now remains unreferenced and can no longer be accessed. Python has a smart garbage collection system that looks out for objects that are no longer accessible and reclaims the memory so that it can be used for some other purpose. alt text One last question: What if we create two integer objects with the same value?

p = 100
q = 100

Before I answer this question. Let's check their ids:

print(f"id of the object pointed by x:{id(p)}")
print(f"id of the object pointed by y:{id(q)}") 
# output
id of object pointed by x:140733799285632
id of object pointed by y:140733799285632

So what python does is optimize memory allocation by creating a single integer object with value 100 and then points both the variables p and q towards it.

Let's check one more example-

m = 350
n = 350

print(f"id of object pointed by x:{id(m)}")
print(f"id of object pointed by y:{id(n)}") 

# output
id of the object pointed by x:1783572881872
id of the object pointed by y:1783572881936

To our surprise the ids of the integer object 300 referred to by m and n are different. What is going on here?? Python allocates memory for integers in the range [-5, 256] at startup, i.e., integer objects for these values are created and ids are assigned. This process is also known as integer caching. So whenever an integer is referenced in this range, python variables point to the cached value of that object. That is why the object id for integer object 100 referenced by p and q print the same ids. For integers outside the range [-5, 256], their objects are created as and when they are defined during program implementation. This is the reason why the ids for integer object 350 referenced by m and n are different. As mentioned earlier, variables point to the objects stored in memory. Hence variables are free to point to object of any type.

x = 10 # x points to an object of 'int' type
x = ["python", 20, "apple"] # x now points to an object of type 'list'
x = "python" # x now points to a string type object

I hope you enjoyed reading it. Feel free to suggest improvements to the article or any other topic that you would like to know more about.