Demystifying *.pyc files

If you have been programming in Python for a while, you have probably noticed a special kind of file appearing every now and then: files with the .pyc extension.

In this article let’s try to demystify these *.pyc files!

But before we start, I recommend reading this article for a better understanding of how Python runs: How Python runs?

After reading this article, you will be able to answer:

  • What are pyc files?
  • When are pyc files created?
  • Can I run pyc files?
  • How to create pyc files without any import?
  • Can I decompile pyc files?
  • Should I ignore pyc files while adding code to git?

So, let’s get started!


What are pyc files?

pyc files are compiled Python files: they contain the bytecode representation of your source code.

Source code is the human-readable text you write; bytecode is the lower-level, platform-independent set of instructions that the interpreter actually executes.

Caching this bytecode speeds up loading, not execution: the code runs at the same speed either way, but Python can skip the compilation step when the bytecode is already available.
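To make "bytecode" a little more concrete, here is a small sketch using the built-in compile() function and the standard dis module. The add function mirrors the myadd.py example used later in this article. The exact instructions that dis prints vary between Python versions, so no output is shown.

```python
import dis

SOURCE = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object, the in-memory form of bytecode.
module_code = compile(SOURCE, "myadd.py", "exec")

# Executing the module's code object defines add() in the given namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

if __name__ == "__main__":
    # The raw bytecode of add() is a bytes object ...
    print(type(add.__code__.co_code))
    # ... and dis renders it as human-readable instructions.
    dis.dis(add)
```

A .pyc file is essentially this code object written to disk, preceded by a short header.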


When are pyc files created?

Whenever a Python script is executed directly, its bytecode is generated in memory and simply discarded when the program exits.

But if a Python module is imported, a .pyc file containing its bytecode is written for the module (by default).
The next time the module is imported, the bytecode from the .pyc file is used, provided the source file has not changed since. This makes loading Python modules much faster because the compilation phase can be bypassed!

There’s no harm in deleting .pyc files; Python simply regenerates them on the next import. Keeping them saves compilation time at startup, which adds up if you import many or large modules.
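How does Python know whether a cached file is still valid? By default, a .pyc records the modification time and size of the source it was compiled from, and the import system recompiles when they no longer match. The sketch below peeks at that header. It assumes CPython 3.7 or later, where the header is 16 bytes long; the helper name read_pyc_header is made up for this illustration.

```python
import os
import py_compile
import struct
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, flags, source_mtime, source_size) from a .pyc header.

    Assumes the 16-byte header layout used by CPython 3.7+.
    """
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    flags, mtime, size = struct.unpack("<III", header[4:])
    return magic, flags, mtime, size


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myadd.py")
        with open(src, "w") as f:
            f.write("def add(a, b):\n    return a + b\n")
        # Ask explicitly for timestamp-based invalidation (the usual default).
        pyc = py_compile.compile(
            src,
            cfile=os.path.join(tmp, "myadd.pyc"),
            doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        return read_pyc_header(pyc), os.stat(src)


if __name__ == "__main__":
    (magic, flags, mtime, size), stat = demo()
    print("recorded mtime matches:", mtime == int(stat.st_mtime) & 0xFFFFFFFF)
    print("recorded size matches:", size == stat.st_size & 0xFFFFFFFF)
```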

In Python 3.2 and later, .pyc files are saved in a sub-directory named __pycache__, located in the directory where your source files reside, with filenames that identify the Python version that created them (e.g. script.cpython-36.pyc).
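If you are curious where the cache file for a given source file would go on your own interpreter, the standard library can tell you. This is a small sketch; myadd.py is just the example name used in this article, and the file does not need to exist because the function only computes a path.

```python
import importlib.util
import os
import sys

# The tag that ends up in the file name, e.g. "cpython-36" on CPython 3.6.
cache_tag = sys.implementation.cache_tag

# Where the import system would cache the bytecode of myadd.py.
pyc_path = importlib.util.cache_from_source("myadd.py")

if __name__ == "__main__":
    print(cache_tag)
    print(pyc_path)
```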

[Figure: flowchart of the process described above; image not available]

Let’s try to understand this by an example.

Consider two Python scripts, myadd.py and test.py.
test.py imports the add function from myadd.py.

[Screenshot: contents of myadd.py and test.py]

Now, if we execute test.py, a folder named __pycache__ is created in the current directory. It contains the following file:

[Screenshot: __pycache__ folder containing myadd.cpython-36.pyc]

Here, myadd.cpython-36.pyc is the name of the generated pyc file. The cpython-36 tag identifies the interpreter that created it: CPython, version 3.6. In addition, the first four bytes of a pyc file hold the so-called magic number, which identifies the bytecode format of the Python version that wrote the file.
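The experiment above can be reproduced with a short script. This is only a sketch: the contents of myadd.py are taken from the decompiled output shown later in this article, while the contents of test.py are an assumption (the article only says that it imports add). Everything happens in a temporary directory, so nothing is left behind.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

MYADD_SOURCE = """\
def add(a, b):
    return a + b

if __name__ == '__main__':
    print(add(2, 3))
"""

# Assumed contents of test.py: import add and call it once.
TEST_SOURCE = """\
from myadd import add

print(add(2, 3))
"""


def run_experiment():
    """Run test.py in a scratch directory; return (stdout, cached file names)."""
    # Drop settings that would disable or relocate the bytecode cache.
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX")}
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "myadd.py").write_text(MYADD_SOURCE)
        (workdir / "test.py").write_text(TEST_SOURCE)
        result = subprocess.run(
            [sys.executable, "test.py"],
            cwd=workdir, env=env, capture_output=True, text=True, check=True,
        )
        cached = sorted(p.name for p in (workdir / "__pycache__").glob("*.pyc"))
    return result.stdout.strip(), cached


if __name__ == "__main__":
    output, cached = run_experiment()
    print(output)
    print(cached)
```

Note that only myadd shows up in the cache: test.py was run as a script, not imported, so its bytecode is not saved.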


Can I run pyc files?

Of course, you can run pyc files!

In a CPython interpreter, bytecode is fed to PVM (Python Virtual Machine) which is responsible for running your code.
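To see that a pyc file really is just stored bytecode, here is a sketch that reads the code object out of a pyc file by hand and passes it to exec. It assumes CPython 3.7 or later, where the code object follows a 16-byte header; the helper names are made up for this illustration, and this is not how the import system itself is implemented.

```python
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+; older versions use a shorter header


def load_code_object(pyc_path):
    """Skip the header and unmarshal the code object stored in a .pyc file."""
    with open(pyc_path, "rb") as f:
        f.seek(HEADER_SIZE)
        return marshal.load(f)


def run_pyc_by_hand():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myadd.py")
        with open(src, "w") as f:
            f.write("def add(a, b):\n    return a + b\n")
        pyc = py_compile.compile(
            src, cfile=os.path.join(tmp, "myadd.pyc"), doraise=True
        )
        code = load_code_object(pyc)
    # Executing the module's code object defines add() in this namespace.
    namespace = {"__name__": "myadd"}
    exec(code, namespace)
    return namespace["add"](2, 3)


if __name__ == "__main__":
    print(run_pyc_by_hand())  # prints 5
```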

Since pyc files contain the bytecode representation of your source code, we can execute them directly (just like normal py files):

$ python myadd.cpython-36.pyc
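As a quick check that the source file is not needed at this point, the following sketch compiles a copy of myadd.py, deletes the source, and runs the remaining pyc file with the current interpreter. The file contents and the helper name are assumptions made for this illustration.

```python
import os
import py_compile
import subprocess
import sys
import tempfile

SOURCE = """\
def add(a, b):
    return a + b

if __name__ == '__main__':
    print(add(2, 3))
"""


def compile_and_run():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myadd.py")
        with open(src, "w") as f:
            f.write(SOURCE)
        pyc = py_compile.compile(
            src, cfile=os.path.join(tmp, "myadd.pyc"), doraise=True
        )
        os.remove(src)  # only the compiled file is left
        result = subprocess.run(
            [sys.executable, pyc], capture_output=True, text=True, check=True
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(compile_and_run())  # prints 5
```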

Note: a pyc file is tied to the Python version that generated it. For example, a pyc file generated by Python 3 can’t be executed by a Python 2 interpreter; the attempt fails with the error: RuntimeError: Bad magic number in .pyc file
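You can check for this kind of mismatch yourself by comparing the first four bytes of a pyc file with the magic number of the running interpreter. The sketch below does that, and then overwrites the magic number to imitate a file from a different Python version. The helper name is_compatible is hypothetical.

```python
import importlib.util
import os
import py_compile
import tempfile


def is_compatible(pyc_path):
    """True if the .pyc starts with the magic number of this interpreter."""
    with open(pyc_path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "myadd.py")
        with open(src, "w") as f:
            f.write("def add(a, b):\n    return a + b\n")
        pyc = py_compile.compile(
            src, cfile=os.path.join(tmp, "myadd.pyc"), doraise=True
        )
        fresh = is_compatible(pyc)
        # Imitate a file from another Python version by replacing the magic number.
        with open(pyc, "r+b") as f:
            f.write(b"\x00\x00\r\n")
        tampered = is_compatible(pyc)
    return fresh, tampered


if __name__ == "__main__":
    print(demo())  # prints (True, False)
```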


How to create pyc files without any import?

If you need to create a pyc file for a module that is not imported, you can use the py_compile module.

The py_compile module can manually compile any module. One way is to use the py_compile.compile function in that module interactively:

import py_compile
py_compile.compile("myadd.py")

This will write the .pyc in the __pycache__ folder in the same location as myadd.py.
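In a script you will usually want to know where the pyc file ended up and whether compilation worked. py_compile.compile returns the path of the file it wrote, and with doraise=True it raises py_compile.PyCompileError instead of only printing the error. A minimal sketch, with a made-up wrapper named compile_checked and throwaway files in a temporary directory:

```python
import os
import py_compile
import tempfile


def compile_checked(source_path):
    """Compile one file and return the .pyc path, or None on a compile error."""
    try:
        # doraise=True turns compile errors into PyCompileError
        # instead of just printing them to stderr.
        return py_compile.compile(source_path, doraise=True)
    except py_compile.PyCompileError as exc:
        print("could not compile:", exc)
        return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "myadd.py")
        bad = os.path.join(tmp, "broken.py")
        with open(good, "w") as f:
            f.write("def add(a, b):\n    return a + b\n")
        with open(bad, "w") as f:
            f.write("def add(a, b:\n")  # deliberate syntax error
        good_pyc = compile_checked(good)
        bad_pyc = compile_checked(bad)
        wrote_good = good_pyc is not None and os.path.exists(good_pyc)
    return wrote_good, bad_pyc


if __name__ == "__main__":
    print(demo())
```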

You can also automatically compile all files in a directory or directories using the compileall module.

$ python -m compileall .
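The same thing can be done from Python code with compileall.compile_dir, which returns a true value when every file compiled successfully. The sketch below runs it on a temporary directory holding two throwaway files; the file names and contents are placeholders.

```python
import compileall
import importlib.util
import os
import tempfile


def compile_tree():
    """Compile every .py file in a scratch directory; report what was cached."""
    with tempfile.TemporaryDirectory() as tmp:
        sources = []
        for name in ("myadd.py", "test.py"):
            path = os.path.join(tmp, name)
            with open(path, "w") as f:
                f.write("x = 1\n")
            sources.append(path)
        # quiet=1 prints errors only, not the name of every compiled file.
        ok = compileall.compile_dir(tmp, quiet=1)
        cached = [
            os.path.exists(importlib.util.cache_from_source(path))
            for path in sources
        ]
    return bool(ok), cached


if __name__ == "__main__":
    print(compile_tree())
```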

Can I decompile pyc files?

Yes, pyc files can be decompiled but the generated source code may or may not be totally identical to your original source code.

Also, there is no built-in module for decompilation. For this purpose, you can use the third-party Python package uncompyle6.

Simply install it using:

$ pip install uncompyle6

Now, using the terminal, you can decompile any pyc file as:

$ uncompyle6 myadd.cpython-36.pyc

The obtained output looks something like this:

# uncompyle6 version 2.14.1
# Python bytecode 3.6 (3379)
# Decompiled from: Python 3.6.3 (default, Oct 3 2017, 21:45:48) 
# [GCC 7.2.0]
# Embedded file name: /home/nikhil/Desktop/myadd.py
# Compiled at: 2018-01-05 09:53:32
# Size of source mod 2**32: 72 bytes

def add(a, b):
 return a + b

if __name__ == '__main__':
 print(add(2, 3))
# okay decompiling myadd.cpython-36.pyc

Should I ignore pyc files while adding code to git?

Since pyc files are generated automatically when you import Python modules, there is no need to add them to git repositories.

To keep pyc files out of your repository, add the entries:

*.pyc
__pycache__

to your .gitignore file, and git will ignore any new pyc files and __pycache__ folders in the repo.

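But what about files that are already being tracked by git? To fix this,
we need to ask git to remove these paths from its index by running the git rm command with the --cached option. Quoting the pattern stops the shell from expanding it, so git itself matches the tracked pyc files in every sub-directory:

$ git rm --cached '*.pyc'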


So, this was all about pyc files! If you have any doubts or find anything incorrect, please share in the comments section below. Thanks for reading!  🙂
