First article of 2018 is up! And I decided to go with a very subtle topic this time.
Have you ever thought how the Python code is actually executed by the Python interpreter? What steps are carried out to generate the final output of your Python script? This article answers all these questions in a simplistic manner!
When the Python software is installed on your machine, minimally, it has:
- an interpreter
- a support library.
Interpreter is nothing but a software which can run your Python scripts.
Interestingly, it can be implemented in any programming language!
- CPython is the default interpreter for Python which is written in C programming language.
- Jython is another popular implementation of python interpreter written using Java programming language.
Programmer’s view of interpreter
If you have been coding in Python for sometime, you must have heard about the interpreter at least a few times. From a programmer’s perspective, an interpreter is simply a software which executes the source code line by line.
For most of the Python programmers, an interpreter is like a black box.
Python’s view of interpreter
Now, let us scan through the python interpreter and try to understand how it works.
Have a look at the diagram shown below:
I hope you didn’t get amazed to see a compiler inside an interpreter!
From the figure above, it can be inferred that interpreter is made up of two parts:
- virtual machine
What does compiler do?
Compiler compiles your source code (the statements in your file) into a format known as byte code. Compilation is simply a translation step!
Byte code is a:
- lower level,
- platform independent,
- efficient and
representation of your source code!
Roughly, each of your source statements is translated into a group of byte code instructions.
The process of compilation in CPython interpreter’s compiler can be divided into 4 main parts:
- Parse source code into a parse tree
Based on grammar rules of Python programming language, the source code is converted to a parse tree. Every node of the parse tree contains a part of your code.
Consider a simple arithmetic expression:
14 + 2 * 3 - 6 / 2
The parse tree for above expression looks like this:
- Transform parse tree into an Abstract Syntax Tree
The abstract syntax tree (AST) is a high-level representation of the program structure.
Each node of the tree denotes a construct occurring in the source code. The syntax is “abstract” in not representing every detail appearing in the real syntax.Consider the AST shown below for the parse tree example discussed above:
- Transform AST into a Control Flow Graph
A control flow graph is a directed graph that models the flow of a program using basic blocks. Each block contains the bytecode representation of program code inside it.
- Byte Code generation from CFG
CFGs are usually one step away from final code output. Code is directly generated from the basic blocks by doing a post-order depth-first search on the CFG following the edges.
This byte code translation is performed to speed up the execution—byte code can be run much quicker than the original source code statements.
What does Virtual Machine do?
As soon as source code gets converted to byte code, it is fed into PVM (Python Virtual Machine).
The PVM sounds more impressive than it is!
It’s just a big loop that iterates through your byte code instructions, one by one, to carry out their operations. The PVM is the runtime engine of Python; it’s always present as part of the Python system, and is the component that truly runs your scripts. Technically, it’s just the last step of what is called the Python interpreter.
So, this is how a python interpreter runs your python code!
Lastly, here are a few points to ponder upon:
- PyPy is an implementation of Python which does not use an interpreter! It is implemented using something called just-in-time compiler!
Interestingly, it often runs faster than the standard implementation of Python, CPython.
- Whenever a Python script is executed, the byte code is generated in memory and simply discarded when program exits.
- But, if a Python module is imported, a .pyc file for the module is generated which contains its Byte code.
Thus, when the module is imported next time, the byte code from .pyc file is used, hence skipping the compilation step!
If you have any doubts or find anything incorrect, please share in the comments section below. Thanks for reading! 🙂