Tar archives

In this section, we are going to learn about the tarfile module. We'll test whether a given filename refers to a valid archive, add a new file to an existing archive, read metadata using the tarfile module, and extract the files from an archive using the extractall() method.

First, we will test whether a given file is a valid tar archive. For this, the tarfile module provides the is_tarfile() function, which returns a Boolean value.

Create a script called check_archive_file.py and write the following content in it:

import tarfile

# Check each name; is_tarfile() raises an error if the file does not exist
for f_name in ['hello.py', 'work.tar.gz', 'welcome.py', 'nofile.tar', 'sample.tar.xz']:
    try:
        print('{:} {}'.format(f_name, tarfile.is_tarfile(f_name)))
    except IOError as err:
        print('{:} {}'.format(f_name, err))

Run the script and you will get the following output:

student@ubuntu:~/work$ python3 check_archive_file.py
hello.py False
work.tar.gz True
welcome.py False
nofile.tar [Errno 2] No such file or directory: 'nofile.tar'
sample.tar.xz True

So, tarfile.is_tarfile() checks each filename in the list. hello.py and welcome.py are not tar files, so it returns False. work.tar.gz and sample.tar.xz are tar files, so it returns True. There is no file named nofile.tar in our directory, so is_tarfile() raises an exception, which our script catches and prints.
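
The same check can be wrapped in a small helper that turns the three possible outcomes into labels. This is only a minimal sketch: the classify() name and its return strings are our own assumptions, not part of the tarfile module.

```python
import tarfile


def classify(path):
    """Return 'tar', 'not tar', or 'missing' for the given path (hypothetical helper)."""
    try:
        return 'tar' if tarfile.is_tarfile(path) else 'not tar'
    except OSError:
        # In Python 3, IOError is an alias of OSError. It is raised here
        # when the file does not exist or cannot be opened.
        return 'missing'


if __name__ == '__main__':
    for name in ['hello.py', 'work.tar.gz', 'nofile.tar']:
        print(name, classify(name))
```

Returning a label instead of printing inside the loop makes the check easier to reuse from other scripts.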

Now, we are going to add a new file to an existing archive. Create a script called add_to_archive.py and write the following code in it:

import shutil
import os
import tarfile
# Archive the work/ directory, relative to its parent, as work.tar
print('creating archive')
shutil.make_archive('work', 'tar', root_dir='..', base_dir='work')
print('Archive contents:')
with tarfile.open('work.tar', 'r') as t_file:
    for name in t_file.getnames():
        print(name)
# Create an empty sample.txt (touch is a Unix command)
os.system('touch sample.txt')
print('adding sample.txt')
# Mode 'a' appends to an existing uncompressed archive
with tarfile.open('work.tar', mode='a') as t:
    t.add('sample.txt')
print('contents:')
with tarfile.open('work.tar', mode='r') as t:
    print([m.name for m in t.getmembers()])

Run the script and you will get the following output:

student@ubuntu:~/work$ python3 add_to_archive.py
creating archive
Archive contents:
work
work/bye.py
work/shutil_make_archive.py
work/check_archive_file.py
work/welcome.py
work/add_to_archive.py
work/shutil_unpack_archive.py
work/hello.py
adding sample.txt
contents:
['work', 'work/bye.py', 'work/shutil_make_archive.py', 'work/check_archive_file.py', 'work/welcome.py', 'work/add_to_archive.py', 'work/shutil_unpack_archive.py', 'work/hello.py', 'sample.txt']

In this example, we first created an archive using shutil.make_archive() and printed its contents. We then created a sample.txt file and added it to the existing work.tar by opening the archive in append mode, 'a'. Note that append mode works only with uncompressed tar archives. Finally, we displayed the contents of the archive again.
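
The append step can also be tried in isolation, without touching the work directory. The following is a minimal sketch that uses a temporary directory; the append_file() helper and the first.txt, second.txt, and demo.tar names are made up for illustration.

```python
import os
import tarfile
import tempfile


def append_file(archive_path, file_path):
    """Append file_path to an uncompressed tar archive and return its member names."""
    # Mode 'a' appends to the archive; it is not available for compressed archives.
    with tarfile.open(archive_path, mode='a') as t:
        # arcname stores the file under its base name instead of its full path.
        t.add(file_path, arcname=os.path.basename(file_path))
    with tarfile.open(archive_path, mode='r') as t:
        return t.getnames()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, 'first.txt')
        second = os.path.join(tmp, 'second.txt')
        for path in (first, second):
            with open(path, 'w') as fh:
                fh.write('demo\n')
        archive = os.path.join(tmp, 'demo.tar')
        # Start with an archive that holds only the first file.
        with tarfile.open(archive, mode='w') as t:
            t.add(first, arcname='first.txt')
        print(append_file(archive, second))
```

Creating the files with open() rather than the touch command also keeps the sketch independent of the operating system.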

Now, we will learn how to read the metadata from an archive file. The getmembers() method returns the members of the archive as a list of TarInfo objects, which hold the metadata of the files. Create a script called read_metadata.py and write the following content in it:

import tarfile
with tarfile.open('work.tar', 'r') as t:
    # Each member is a TarInfo object describing one archive entry
    for file_info in t.getmembers():
        print(file_info.name)
        print("Size :", file_info.size, 'bytes')
        print("Type :", file_info.type)
        print()

Run the script and you will get the following output:

student@ubuntu:~/work$ python3 read_metadata.py
work/bye.py
Size : 30 bytes
Type : b'0'

work/shutil_make_archive.py
Size : 243 bytes
Type : b'0'

work/check_archive_file.py
Size : 233 bytes
Type : b'0'

work/welcome.py
Size : 48 bytes
Type : b'0'

work/add_to_archive.py
Size : 491 bytes
Type : b'0'

work/shutil_unpack_archive.py
Size : 279 bytes
Type : b'0'
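
The type value b'0' shown above is the constant tarfile.REGTYPE, which marks a regular file. Directories use tarfile.DIRTYPE. Instead of comparing raw type bytes, a script can call the isfile() and isdir() methods of each TarInfo object. The following is a minimal sketch; the member_summary() helper is our own name, and the archive is built in memory purely for illustration.

```python
import io
import tarfile


def member_summary(member):
    """Return (name, kind, size) for a tarfile.TarInfo object (hypothetical helper)."""
    if member.isdir():
        kind = 'directory'
    elif member.isfile():
        kind = 'file'
    else:
        kind = 'other'
    return member.name, kind, member.size


if __name__ == '__main__':
    # Build a small archive in memory so the sketch needs no files on disk.
    data = b'print("hello")\n'
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w') as t:
        folder = tarfile.TarInfo('work')
        folder.type = tarfile.DIRTYPE
        t.addfile(folder)
        script = tarfile.TarInfo('work/hello.py')
        script.size = len(data)
        t.addfile(script, io.BytesIO(data))
    # Rewind the buffer and read the metadata back.
    buffer.seek(0)
    with tarfile.open(fileobj=buffer, mode='r') as t:
        for member in t.getmembers():
            print(member_summary(member))
```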

Now, we will extract the contents of an archive using the extractall() method. For that, create a script called extract_contents.py and write the following code in it:

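import tarfile
import os
# Create the target directory; exist_ok avoids an error if the script is run again
os.makedirs('work', exist_ok=True)
with tarfile.open('work.tar', 'r') as t:
    t.extractall('work')
print(os.listdir('work'))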

Run the script and you will get the following output:

student@ubuntu:~/work$ python3 extract_contents.py

Check your current working directory, and you will find the work/ directory. Navigate to that directory and you can find your extracted files.
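
When an archive comes from an untrusted source, extractall() can write files outside the target directory if member names contain absolute paths or '..' components. Newer Python versions provide extraction filters to guard against this. The following is a minimal sketch, assuming the filter is detected through tarfile.data_filter; the extract_to() helper is our own name and is not part of the tarfile module.

```python
import os
import tarfile


def extract_to(archive_path, dest_dir):
    """Extract every member of archive_path into dest_dir and return the sorted listing."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, 'r') as t:
        if hasattr(tarfile, 'data_filter'):
            # Extraction filters are available: reject unsafe members.
            t.extractall(dest_dir, filter='data')
        else:
            # Older Python versions: plain extraction, trusted archives only.
            t.extractall(dest_dir)
    return sorted(os.listdir(dest_dir))


if __name__ == '__main__':
    # Assumes work.tar from the earlier scripts exists in the current directory.
    if os.path.exists('work.tar'):
        print(extract_to('work.tar', 'work'))
```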
