How Python runs?

First article of 2018 is up! And I decided to go with a very subtle topic this time.

Have you ever wondered how Python code is actually executed by the Python interpreter? What steps are carried out to produce the final output of your Python script? This article answers these questions in a simple manner!

When Python is installed on your machine, it includes, at a minimum:

  • an interpreter
  • a support library.
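
As a small sketch, both parts can be located from inside a running Python session. It assumes that the "support library" corresponds to the standard library directory reported by `sysconfig`; the variable names are only illustrative.

```python
import sys
import sysconfig

# Path of the interpreter binary that is running this code
# (may be an empty string in some embedded environments).
interpreter_path = sys.executable

# Directory that holds the pure-Python part of the standard library.
stdlib_path = sysconfig.get_path("stdlib")

print("interpreter:", interpreter_path)
print("standard library:", stdlib_path)
```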

The interpreter

The interpreter is simply a program that runs your Python scripts.

Interestingly, it can be implemented in any programming language!

  • CPython is the default (reference) interpreter for Python, written in the C programming language.
  • Jython is another popular implementation of the Python interpreter, written in the Java programming language.
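
If you are not sure which implementation you are running, the standard library can report it. This is a minimal sketch; the printed name depends on your installation.

```python
import platform
import sys

# Name of the implementation running this code, e.g. "CPython", "PyPy" or "Jython".
implementation = platform.python_implementation()
version = platform.python_version()

print(f"Running {implementation} {version}")

# sys.implementation carries the same information in lower case, e.g. "cpython".
print(sys.implementation.name)
```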


Programmer’s view of interpreter

If you have been coding in Python for some time, you must have heard about the interpreter at least a few times. From a programmer’s perspective, an interpreter is simply a program that executes the source code line by line.

For most Python programmers, the interpreter is a black box.

Python’s view of interpreter

Now, let us look inside the Python interpreter and try to understand how it works.

Have a look at the diagram shown below:

You might be surprised to see a compiler inside an interpreter!

From the figure above, we can see that the interpreter is made up of two parts:

  • compiler
  • virtual machine 

What does the compiler do?

The compiler translates your source code (the statements in your file) into a format known as byte code. Compilation is simply a translation step!

Byte code is a:

  • lower level,
  • platform independent,
  • efficient and
  • intermediate

representation of your source code!

Roughly, each of your source statements is translated into a group of byte code instructions.
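
To see this translation for yourself, here is a minimal sketch using the built-in `compile()` function and the standard `dis` module. The three-line script is made up for illustration, and the exact instruction names in the listing differ between CPython versions.

```python
import dis

# Hypothetical three-line script, used only for illustration.
source = "a = 14\nb = a + 2\nprint(b)\n"

# compile() runs the compilation step on its own and returns a code object.
code_object = compile(source, "<example>", "exec")

# The byte code itself is a plain bytes object stored on the code object.
raw_bytecode = code_object.co_code
print(type(raw_bytecode).__name__, len(raw_bytecode), "bytes")

# dis.dis() prints a human-readable listing, annotated with source line numbers.
dis.dis(code_object)

# dis.get_instructions() yields the same instructions as objects.
instructions = list(dis.get_instructions(code_object))
print(len(instructions), "instructions for", len(source.splitlines()), "source lines")
```

In the listing, each source line is followed by its own small group of instructions, which is the "group of byte code instructions" mentioned above.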

The compilation process in CPython’s compiler can be divided into four main steps:

  • Parse source code into a parse tree
    Based on the grammar rules of the Python language, the source code is converted into a parse tree. Every node of the parse tree holds a part of your code.
    Consider a simple arithmetic expression:

    14 + 2 * 3 - 6 / 2

    The parse tree for the above expression looks like this:

  • Transform parse tree into an Abstract Syntax Tree
    The abstract syntax tree (AST) is a high-level representation of the program structure.
    Each node of the tree denotes a construct occurring in the source code. The syntax is “abstract” in that it does not represent every detail appearing in the real syntax. Consider the AST shown below for the parse tree example discussed above:
  • Transform AST into a Control Flow Graph
    A control flow graph is a directed graph that models the flow of a program using basic blocks. Each block contains the bytecode representation of program code inside it.

  • Byte Code generation from CFG
    CFGs are usually one step away from final code output. Code is directly generated from the basic blocks by doing a post-order depth-first search on the CFG following the edges.
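
The AST step of the pipeline above can be inspected with the standard `ast` module. This is a minimal sketch using the same arithmetic expression. It starts from the AST because current Python versions no longer expose the concrete parse tree (the old `parser` module was removed in Python 3.10). The `evaluate` helper and the `OPERATORS` table are illustrative names, not part of CPython, and the `indent` argument of `ast.dump` needs Python 3.9 or newer.

```python
import ast
import operator

expression = "14 + 2 * 3 - 6 / 2"

# Parse the expression into an abstract syntax tree.
tree = ast.parse(expression, mode="eval")
print(ast.dump(tree, indent=2))

# Illustrative mapping from AST operator nodes to Python functions.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node):
    """Evaluate a tiny arithmetic AST by walking it recursively."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        left = evaluate(node.left)
        right = evaluate(node.right)
        return OPERATORS[type(node.op)](left, right)
    raise ValueError(f"unsupported node: {type(node).__name__}")


result = evaluate(tree)
print(result)  # 17.0
```

Notice that operator precedence is already encoded in the shape of the tree: the multiplication and the division sit deeper than the addition and the subtraction, so they are evaluated first.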

This byte code translation is performed to speed up execution: running byte code is much quicker than re-reading and interpreting the original source code statements each time.


What does the virtual machine do?

As soon as the source code has been converted to byte code, it is fed into the PVM (Python Virtual Machine).

The PVM sounds more impressive than it is!

It’s just a big loop that iterates through your byte code instructions, one by one, to carry out their operations. The PVM is the runtime engine of Python; it’s always present as part of the Python system, and is the component that truly runs your scripts. Technically, it’s just the last step of what is called the Python interpreter.
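
To make the "big loop" idea concrete, here is a toy stack machine. It is only a sketch: the instruction names (`PUSH`, `ADD`, and so on) and the `run` function are invented for this example and are not CPython's real byte code or API.

```python
import operator

# Invented binary instructions for this toy machine (not real CPython opcodes).
BINARY_OPS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
}


def run(instructions):
    """Execute (opname, argument) pairs one by one on a value stack."""
    stack = []
    for opname, argument in instructions:
        if opname == "PUSH":
            stack.append(argument)
        elif opname in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[opname](left, right))
        else:
            raise ValueError(f"unknown instruction: {opname}")
    return stack.pop()


# Hand-written "byte code" for 14 + 2 * 3 - 6 / 2.
program = [
    ("PUSH", 14),
    ("PUSH", 2),
    ("PUSH", 3),
    ("MUL", None),
    ("ADD", None),
    ("PUSH", 6),
    ("PUSH", 2),
    ("DIV", None),
    ("SUB", None),
]

result = run(program)
print(result)  # 17.0
```

The real PVM handles many more instructions, but it follows the same pattern: fetch an instruction, carry out its operation, move on to the next one.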

So, this is how the Python interpreter runs your Python code!

Lastly, here are a few points to ponder upon:

  • PyPy is an alternative implementation of Python that pairs its interpreter with a just-in-time (JIT) compiler, which translates frequently executed code into machine code at run time!
    Interestingly, it often runs faster than the standard implementation of Python, CPython.
  • Whenever a Python script is executed directly, its byte code is generated in memory and simply discarded when the program exits.
  • But if a Python module is imported, a .pyc file containing its byte code is generated for the module (in Python 3, inside a __pycache__ directory).
    Thus, the next time the module is imported, the byte code from the .pyc file is used as long as the source file has not changed, skipping the compilation step!
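
The caching behaviour described above can be tried out without importing anything. This is a small sketch: the module name `greeting.py` is hypothetical, and `py_compile` is used to write the .pyc file explicitly instead of relying on an import.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical one-line module, created only for this example.
    source_path = os.path.join(workdir, "greeting.py")
    with open(source_path, "w") as handle:
        handle.write("MESSAGE = 'hello'\n")

    # Where the import system would look for this module's cached byte code.
    expected_cache = importlib.util.cache_from_source(source_path)

    # Compile the module explicitly; the return value is the path of the .pyc file.
    written_cache = py_compile.compile(source_path, doraise=True)

    cache_exists = os.path.exists(written_cache)
    print("cache file:", written_cache)
    print("exists:", cache_exists)
```

The file name includes a tag for the implementation and version that produced it, so different Python versions can keep their caches side by side.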

If you have any questions or find anything incorrect, please share in the comments section below. Thanks for reading! 🙂


5 thoughts on “How Python runs?”

  1. Byte Code generation from CFG
    CFGs are usually one step away from final code output. Code is directly generated from the basic blocks by doing a post-order depth-first search on the CFG following the edges.

    I think it’s an in-order traversal.

