In this tutorial, we will go through ZIP archive, in brief, and cover how to work efficiently with ZipFile in python. We will be using the inbuilt python library Zipfile for this and implement some simple python scripts.
What is a ZIP archive and how it works?
.ZIP files are archives that store multiple files. ZIP allows contained files to be compressed using many different methods and simply storing a file without compressing it. Each file is stored separately, allowing additional files in the same archive to be compressed using other methods. Because the ZIP archive files are compressed individually, it is possible to extract them or add new ones without applying compression or decompression to the entire archive. This contrasts with the format of compressed tar files, for which such random-access processing is not easily possible.
A directory is placed at the end of a ZIP file. This identifies what files are in the ZIP and identifies where in the ZIP that file is located. This allows ZIP readers to load the list of files without reading the entire ZIP archive. ZIP archives can also include extra data that is not related to the ZIP archive.
This allows for a ZIP archive to be made into a self-extracting archive (an application that decompresses its contained data) by prepending the program code to a ZIP archive and marking the file as executable.
Storing the catalogue at the end also makes it possible to hide a zipped file by appending it to an innocuous file, such as a GIF image file. ( This can be used to sneak in or sneak out the data 😈 )
Extracting the Data from a ZIP Archive
# Import the ZipFile class from in-built zipfile library from zipfile import ZipFile # Import os library to get current working directory import os # File location file_location = "InventoryListSheet.zip" # Get present file directory # Note: You may change the unzipping location from here unzipping_location = os.getcwd() # Now open the zip file in read mode with ZipFile(file=file_location, mode="r") as file: # lists the files present in the zip file file.printdir() # extract the zip file to the unzipping location file.extractall(path=unzipping_location) print("File(s) extracted")
The file(s) gets unzipped at the
unzipping_location. Please note that if you don’t pass the
path variable in the
extractall() function. It will get unzipped at the present working directory. The output will look like this
Create a ZIP Archive
To create a ZIP Archive we will need the path of all the files to be included in the Archive and we also need to specify the compression method.
# Import the ZipFile class from in-built zipfile library from zipfile import ZipFile, ZIP_DEFLATED # Import os library to get present working directory import os def zip_directory(path_to_be_zipped, zipf): # file are being written in zipf for root, _, files in os.walk(path_to_be_zipped): for file in files: zipf.write(os.path.join(root, file), os.path.relpath(os.path.join(root, file), os.path.join(path_to_be_zipped, '..'))) # Director to be zipped path_to_be_zipped = "path/to/folder" # Open a zip file archive in write mode # ZIP_DEFLATED - The numeric constant for the usual ZIP compression method. zipf = ZipFile('final.zip', 'w', ZIP_DEFLATED) zip_directory(path_to_be_zipped, zipf) # print all the files in the ZIP file zipf.printdir() zipf.close() print("Folder zipped")
The files in the
path_to_be_zipped will be zipped in the and the output
final.zip will appear in the current working directory.
That’s it folks for this tutorial. Please feel free to comment below if you need any help.