Using Python to Zip Files

29/12/2020
With the volume of data growing every day, it is reasonable to worry about how to keep it all without filling up your storage devices. Compressing files produces smaller files known as archives. These archives hold all your data in its original form, often at a fraction of the original size. You can do this by right-clicking a file or a group of files and selecting the ‘add to archive’ option. The same can be done with Python, and that is what we discuss here.

Readying up the Python environment

To start using Python, make sure that it is correctly installed and that its dependencies are up to date. Before installing anything else, run the following command:

sudo apt-get update

This command refreshes the package lists, so the system knows about the latest versions of the installed packages and about any new packages added to the repositories. Once it has run successfully, check whether Python is installed properly by typing the following command in the terminal window:

python3 -V

If all went according to plan, you should see a version number as the output. If you have reached this point without any errors, you can now install a tool that makes working with Python more pleasant. We will be using Jupyter notebook, which you can install with the following command:

sudo apt-get -y install ipython ipython-notebook && pip install jupyter

You are now ready to write simple Python programs that create compressed .zip files and unzip them.

To start Jupyter notebook, type the following command in the terminal:

jupyter notebook

(Note: the interface in this article's screenshots is a bit different from the standard Jupyter notebook. Everything else works the same.)

When the command has run, it opens the Jupyter notebook interface in your web browser, where you can start coding by creating a new notebook file. From there, you can write your archiving programs.

The necessary imports

Now that we have reached the coding part, let us begin with the imports this task needs. If our program only creates zip files from existing files, we need to import just two modules:

zipfile and os

The ‘zipfile’ module is fairly self-explanatory. It provides easy-to-use functions for creating zip files and adding directories, files, and subdirectories to them with minimal effort. The module can do much more, and if you want to explore its full potential, see its page in the official Python documentation.

The ‘os’ module lets us work with file paths in our programs. It allows us to view directories and list their contents in a simple way. Both modules are part of the Python standard library, and used together they let you build robust programs that handle compression tasks for you.
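As a minimal sketch of how the two modules work together, the snippet below lists the entries of a directory with os and uses zipfile to flag the ones that are already zip archives. The function name describe_directory is a hypothetical helper chosen for illustration; it is not part of either module.

```python
import os
import zipfile


def describe_directory(directory="."):
    """Return (name, kind) pairs for the entries of a directory."""
    entries = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            kind = "directory"
        elif zipfile.is_zipfile(full_path):
            kind = "zip archive"
        else:
            kind = "file"
        entries.append((name, kind))
    return entries


# Show what is in the current working directory.
for name, kind in describe_directory("."):
    print(kind, name)
```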

Zipping starts here

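When you run code in your notebook that writes a single file into a new archive with the zipfile module, you will see that a zip file has been created in the directory. Provided compression is enabled (zipfile stores files uncompressed by default), its size is also considerably smaller than the original file. This first example involves just a single file.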
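The single-file case might look like the sketch below. It is not the article's original snippet; the helper name zip_single_file and the file name sample.txt are assumptions made for illustration, and the demo runs in a temporary directory so that nothing is left behind.

```python
import os
import tempfile
import zipfile


def zip_single_file(source_path, archive_path):
    """Write one file into a new zip archive using DEFLATE compression."""
    # ZIP_DEFLATED turns compression on; the default (ZIP_STORED) does not compress.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps only the file name inside the archive, not the full path.
        archive.write(source_path, arcname=os.path.basename(source_path))
    return archive_path


# Demo: create a repetitive text file, zip it, and compare the sizes.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "sample.txt")  # hypothetical file name
    with open(source, "w") as handle:
        handle.write("the same line of text\n" * 1000)
    archive_file = zip_single_file(source, os.path.join(workdir, "sample.zip"))
    print("original:", os.path.getsize(source), "bytes")
    print("archive: ", os.path.getsize(archive_file), "bytes")
```

How much space is saved depends on the data: repetitive text shrinks a lot, while files that are already compressed (images, videos) barely shrink at all.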

If we want to take it a step further and add directories and multiple paths to our zip file, we can walk each directory with the os module and write every file it contains into the archive.

Again, after running such a script, you will find a new zip file in your current directory, and its size will typically be smaller than the combined size of its contents.
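One way to handle several paths and whole directory trees is sketched below, assuming os.walk is used to visit every file. The helper name zip_paths and the sample file names are hypothetical. Note that this sketch adds files only, so empty directories do not end up in the archive.

```python
import os
import tempfile
import zipfile


def zip_paths(paths, archive_path):
    """Add files and whole directory trees to a new zip archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            if os.path.isdir(path):
                # Store entries relative to the directory's parent,
                # so the top-level folder name is kept inside the archive.
                parent = os.path.dirname(os.path.abspath(path))
                for root, _dirs, files in os.walk(path):
                    for name in files:
                        full_path = os.path.join(root, name)
                        archive.write(full_path, arcname=os.path.relpath(full_path, parent))
            else:
                archive.write(path, arcname=os.path.basename(path))
    return archive_path


# Demo: one directory tree plus one loose file, built in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    project = os.path.join(workdir, "project")  # hypothetical directory name
    os.makedirs(os.path.join(project, "sub"))
    for relative in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(project, relative), "w") as handle:
            handle.write("some text\n" * 100)
    notes = os.path.join(workdir, "notes.txt")
    with open(notes, "w") as handle:
        handle.write("loose file\n")

    # Keep the archive outside the zipped directory so it does not include itself.
    result = zip_paths([project, notes], os.path.join(workdir, "bundle.zip"))
    with zipfile.ZipFile(result) as archive:
        print(sorted(archive.namelist()))
```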

Unzipping the compressed files

You might have guessed it already: with the libraries we are using, most routine tasks come down to a one-line call, and unzipping is no exception. The same zipfile module is used here. After opening the archive by its path, we simply call the extractall method, which takes as its argument the path of the directory where we want the contents extracted.
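A short sketch of the extraction step follows. The helper name unzip and the sample archive contents are assumptions for illustration; the demo first builds a small archive in a temporary directory so that there is something to extract.

```python
import os
import tempfile
import zipfile


def unzip(archive_path, destination):
    """Extract every member of a zip archive into the destination directory."""
    with zipfile.ZipFile(archive_path, "r") as archive:
        archive.extractall(destination)
        return archive.namelist()


# Demo: build a small archive, then extract it into a new folder.
with tempfile.TemporaryDirectory() as workdir:
    archive_path = os.path.join(workdir, "sample.zip")  # hypothetical archive name
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("docs/readme.txt", "hello from the archive\n")

    target = os.path.join(workdir, "unpacked")
    print(unzip(archive_path, target))
    print(os.listdir(os.path.join(target, "docs")))
```

If extractall is called without an argument, the contents are extracted into the current working directory.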

Why choose to program this?

The answer to this question is simple: automation. If you are working with large amounts of data, you need a way to cut down on the manual labor involved. These days, if something can be automated, it should be. There is little point in repeating a task by hand when a cheap script can do it far more efficiently. The energy that would be spent making zip files manually can go into something more productive.
